ALIGNMENTS



RESULT 1 
AAW94291 

ID AAW94291 standard; protein; 269 AA. 
XX 

AC AAW94291; 
XX 

DT 27-APR-1999 (first entry) 
XX 

DE Human beta-amyloid peptide-binding protein (BBP).
XX 

KW Beta-amyloid peptide binding protein; BBP; beta-amyloid protein; BAP; 

KW human; Alzheimer's disease. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers
FT Region 68..269
FT /note= "specifically claimed fragment having beta-amyloid
FT peptide binding activity"



XX 

PN WO9846636-A2.
XX 

PD 22-OCT-1998. 
XX 

PF 14-APR-1998; 98WO-US007462 . 
XX 

PR 16-APR-1997; 97US-0064583P . 
XX 

PA (AMHP ) AMERICAN HOME PROD CORP. 
XX 

PI Ozenberger BA, Kajkowski EM, Jacobsen JS, Bard JA, Walker SG; 
XX 

DR WPI; 1999-080736/07. 

DR N-PSDB; AAX05735. 
XX 

PT Polynucleotide encoding beta-amyloid peptide binding protein - used to 

PT identify inhibitors of beta-amyloid peptide for treating Alzheimer's 

PT disease. 
XX 

PS Claim 7; Page 43-44; 59pp; English. 
XX 

CC The present sequence represents a beta-amyloid peptide binding protein 

CC (BBP). The polynucleotide comprising the entire BBP nucleotide sequence

CC of clone BBP1-fl is deposited under the accession number ATCC 98617. The

CC polynucleotide comprising a fragment of BBP (nucleotides 202-807 of the 

CC full length BBP) of clone pEK196 is deposited as ATCC 98399. Host cells 

CC transformed with a vector comprising the BBP nucleic acid are used for 

CC the recombinant production of the protein. The protein can be used in a 

CC method for diagnosing a disease characterised by aberrant expression of 

CC human beta-amyloid protein (BAP). The protein can also be used in a

CC method for screening for compounds which regulate expression of a BAP

CC binding protein. The proteins, antibodies and identified compounds can be

CC used in the treatment or prevention of Alzheimer's disease
XX 

SQ Sequence 269 AA; 

Query Match 100.0%; Score 1439; DB 2; Length 269; 
Best Local Similarity 100.0%; Pred. No. 8e-141; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269

RESULT 2 
AAY70759 



ID AAY70759 standard; protein; 269 AA. 
XX 

AC AAY70759; 
XX 

DT 24-JUL-2000 (first entry) 
XX 

DE Human beta-amyloid peptide (BAP) binding protein, BBP1.
XX 

KW Beta-amyloid peptide binding protein; BBP; BAP; tumour; suppressor; 
KW G-protein coupled receptor; GPCR; integral membrane protein; antigen; 
KW neuronal cell; nonhuman primate; NHP; G-protein signalling pathway; 
KW apoptosis; immunogen; therapeutic; treatment; prevention; diagnostic. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT Domain 177..198
FT /label= Transmembrane_domain_1
FT Domain 199..201
FT /label= DRF_motif
FT /note= "Substitution of the Arg abrogates protection"
FT Domain 213..238
FT /label= Transmembrane_domain_2

XX 

PN WO200022125-A2 . 
XX 

PD 20-APR-2000. 
XX 

PF 13-OCT-1999; 99WO-US021621 . 
XX 

PR 13-OCT-1998; 98US-0104104P . 
XX 

PA (AMHP ) AMERICAN HOME PROD CORP. 
XX 

PI Ozenberger BA, Kajkowski EM, Lo CF; 
XX 

DR WPI; 2000-317982/27. 
DR N-PSDB; AAZ52369. 
XX 

PT Novel G-protein-coupled receptor-like proteins and polynucleotides useful 
PT for regulating apoptosis, comprises integral membrane protein traversing 
PT the membrane twice. 
XX 

PS Example 1; Page 62-63; 68pp; English. 
XX 

CC The present sequence is the beta-amyloid peptide (BAP) binding protein-1 
CC (BBP1). It is an integral membrane protein that traverses the membrane

CC twice. It is related to the G protein-coupled receptor (GPCR) protein

CC superfamily. It interacts with G-alpha proteins and regulates the

CC activity of G-protein signalling pathways. BBP genes are widely expressed 

CC in neuronal cells of nonhuman primate (NHP) brain and overexpressed in 

CC some tumours. It functions as a suppressor of apoptosis induction. BBP 

CC proteins are used as immunogens to raise antibodies, useful as 

CC therapeutics and as antigens in solid phase assays. They are also useful 

CC as reagents to identify molecules which affect the interaction of BBP and

CC a cloned protein, that are useful in the treatment or prevention of 

CC diseases associated with apoptosis. The polynucleotides are useful for 

CC diagnostics. Note: In claim 5, the patent claims an amino acid sequence 

CC from figure 2. However, figure 2 does not contain any sequence. It is 

CC inferred from the disclosure that the figure 2 sequence refers to BBP1 

CC protein, shown in this sequence 

XX 

SQ Sequence 269 AA; 

Query Match 100.0%; Score 1439; DB 3; Length 269; 

Best Local Similarity 100.0%; Pred. No. 8e-141; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269



RESULT 3 
AAE33877 

ID AAE33877 standard; protein; 269 AA. 
XX 

AC AAE33877; 
XX 

DT 02-MAY-2003 (first entry) 
XX 

DE Human BBP-1 protein. 
XX 

KW Human; beta-amyloid peptide-binding protein; BAP; Abeta; betaAP; BBP; 

KW Alzheimer's disease; AD; transgenic; transgenic animal; gene therapy; 

KW neuroprotective; nootropic. 
XX 

OS Homo sapiens. 



XX 

PN WO200290499-A2. 
XX 

PD 14-NOV-2002 . 
XX 

PF 06-MAY-2002; 2002WO-US014223 . 
XX 

PR 09-MAY-2001; 2001US-00852100 . 
XX 

PA (AMHP ) WYETH. 
XX 

PI Ozenberger BA, Bard JA, Kajkowski EM, Jacobsen JS, Walker SG; 

PI Sofia HJ, Howland DS;

XX 

DR WPI; 2003-120537/11. 

DR N-PSDB; AAD51940. 
XX 

PT New human beta-amyloid peptide-binding protein, useful for diagnosing 

PT and/or treating diseases associated with aberrant expression of beta- 

PT amyloid peptide, e.g. Alzheimer's disease. 
XX 

PS Claim 4; Page 84-85; 85pp; English. 
XX 

CC The present invention relates to novel human beta-amyloid peptide (BAP; 

CC Abeta, betaAP) -binding (BBP) proteins and polynucleotides encoding such 

CC proteins. BBP sequences are useful to diagnose and/or treat diseases 

CC associated with aberrant expression of human BAP such as Alzheimer's 

CC disease (AD). They are used to generate transgenic animals. Sequences of 

CC the invention are also used in gene therapy. The present sequence is 

CC human BBP-1 protein 
XX 

SQ Sequence 269 AA;

Query Match 100.0%; Score 1439; DB 6; Length 269; 

Best Local Similarity 100.0%; Pred. No. 8e-141; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269



RESULT 4 
AAY12358 

ID AAY12358 standard; protein; 139 AA. 
XX 

AC AAY12358; 
XX 

DT 17-JUN-1999 (first entry) 
XX 

DE Human 5' EST secreted protein SEQ ID NO: 389.
XX 

KW Human; secreted protein; EST; expressed sequence tag; diagnosis; 

KW forensic; gene therapy; chromosome mapping; signal peptide; 

KW upstream regulatory sequence; cytokine activity; cell proliferation; 

KW differentiation; haematopoiesis regulation; tissue growth regulation; 

KW reproductive hormone regulation; chemotactic; chemokinetic; haemostatic; 

KW thrombolytic; anti-inflammatory; tumour inhibition. 

XX 

OS Homo sapiens. 
XX 

PN WO9906548-A2. 
XX 

PD 11-FEB-1999.
XX 

PF 31-JUL-1998; 98WO-IB001222 . 
XX 

PR 01-AUG-1997; 97US-00905135 . 
XX 

PA (GEST ) GENSET. 
XX 

PI Dumas Milne Edwards J, Duclert A, Lacroix B; 
XX 

DR WPI; 1999-153778/13. 

DR N-PSDB; AAX41191. 
XX 

PT New nucleic acids encoding human secreted proteins - obtained from cDNA 

PT libraries prepared from e.g. liver, ovary, brain, prostate, kidney, lung, 

PT umbilical cord, placenta and colon tissue. 
XX 

PS Claim 27; Page 714-715; 824pp; English. 
XX 

CC AAX41094 to AAX41347 represent 5' expressed sequence tags (ESTs) for 

CC human secreted proteins, and encode the proteins given in AAY12261 to 

CC AAY12514, respectively. The proteins given represent the signal peptide 

CC and an N-terminal fragment of a secreted protein. The nucleic acid 

CC sequences can be used for producing secreted human gene products. They 

CC can also be used to develop products for diagnosis and therapy. The 

CC proteins obtained may have cytokine activity, cell 

CC proliferation/differentiation activity, haematopoiesis regulating 

CC activity, tissue growth regulating activity, reproductive hormone 

CC regulating activity, chemotactic/ chemokinetic activity, haemostatic and 

CC thrombolytic activity, receptor/ ligand activity, anti-inflammatory 

CC activity, tumour inhibition activity or other activities . The products 

CC can be used in forensic, gene therapy and chromosome mapping procedures. 

CC The sequences can also be used for obtaining corresponding promoter 

CC sequences. The nucleic acids encoding the signal peptide can be used for 

CC directing extracellular secretion of a polypeptide or the insertion of a 



CC polypeptide into a membrane, or importing a polypeptide into a cell 
XX 

SQ Sequence 139 AA; 

Query Match 52.0%; Score 748; DB 2; Length 139;

Best Local Similarity 99.3%; Pred. No. 2.2e-69;

Matches 138; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy  63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 122
       ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAAAWXSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 60

Qy 123 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 182
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 120

Qy 183 YKVAVALSLFLGWLGADRF 201
       |||||||||||||||||||
Db 121 YKVAVALSLFLGWLGADRF 139


RESULT 5
AAY36021

ID AAY36021 standard; protein; 162 AA.
XX

AC AAY36021;
XX

DT 1?-SEP-1999 (first entry)
XX

DE Extended human secreted protein sequence, SEQ ID NO. 406.
XX

KW Secreted protein; human; cytokine; cellular proliferation; cell movement;
KW cellular differentiation; immune system regulator; anti-inflammatory;
KW haematopoiesis regulator; tissue growth regulator; tumour inhibitor;
KW reproductive hormone regulator; chemotaxis; chemokinesis; gene therapy;
KW genetic disease.
XX

OS Homo sapiens.
XX

PN WO9931236-A2.
XX

PD 24-JUN-1999.
XX

PF 17-DEC-1998; 98WO-IB002122.
XX

PR 17-DEC-1997; 97US-0069957P.
PR 09-FEB-1998; 98US-0074121P.
PR 13-APR-1998; 98US-0081563P.
PR 10-AUG-1998; 98US-0096116P.
XX

PA (GEST ) GENSET.
XX

PI Bougueleret L, Duclert A, Dumas Milne Edwards J;
XX

DR WPI; 1999-385906/32.
DR N-PSDB; AAX97705.
XX

PT New isolated human secreted proteins. 
XX 

PS Claim 9; Page 346-347; 516pp; English. 
XX 

CC This sequence is encoded by an extended human secreted protein coding 

CC sequence of the invention. The secreted proteins can be used in treating 

CC or controlling a variety of human conditions. The secreted proteins may 

CC act as cytokines or may affect cellular proliferation or differentiation 

CC or may act as immune system regulators, haematopoiesis regulators, tissue 

CC growth regulators, regulators of reproductive hormones or cell movement 

CC or have chemotactic/chemokinetic, receptor/ligand, anti-inflammatory or 

CC tumour inhibition activity. The DNAs can be used in forensic procedures 

CC to identify individuals or in diagnostic procedures to identify 

CC individuals having genetic diseases resulting from abnormal expression of 

CC the genes corresponding to the extended cDNAs . They are also useful for 

CC constructing a high resolution map of the human chromosomes. They can 

CC also be used for gene therapy to control or treat genetic diseases 
XX 

SQ Sequence 162 AA; 

Query Match 46.8%; Score 673.5; DB 2; Length 162;

Best Local Similarity 84.1%; Pred. No. 1.6e-61;

Matches 127; Conservative 4; Mismatches 17; Indels 3; Gaps 2;

Qy  63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 122
       ||||||||| |||||||||||| |||||||||||||||||||||||||||||||||||||
Db   1 MAAAWPSGPXAPEAVTARLVGVXWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 60

Qy 123 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 182
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 120

Qy 183 YKVAVALSLFLGWLGADRFYLGY-PALGLLK 212
       |     :|    |:   |  | : |  | :|
Db 121 YNEQSHVS--FSWMVGSRSILPWIPCFGFVK 149



RESULT 6 
AAY12426 

ID AAY12426 standard; protein; 148 AA. 
XX 

AC AAY12426; 
XX 

DT 17-JUN-1999 (first entry) 
XX 

DE Human 5' EST secreted protein SEQ ID NO: 457. 
XX 

KW Human; secreted protein; EST; expressed sequence tag; diagnosis; 

KW forensic; gene therapy; chromosome mapping; signal peptide; 

KW upstream regulatory sequence; cytokine activity; cell proliferation; 

KW differentiation; haematopoiesis regulation; tissue growth regulation; 

KW reproductive hormone regulation; chemotactic; chemokinetic; haemostatic; 

KW thrombolytic; anti-inflammatory; tumour inhibition. 

XX 

OS Homo sapiens. 
XX 

PN WO9906548-A2. 



XX 

PD 11-FEB-1999.
XX 

PF 31-JUL-1998; 98WO-IB001222 . 
XX 

PR 01-AUG-1997; 97US-00905135 . 
XX 

PA (GEST ) GENSET. 
XX 

PI Dumas Milne Edwards J, Duclert A, Lacroix B; 
XX 

DR WPI; 1999-153778/13. 

DR N-PSDB; AAX41259. 
XX 

PT New nucleic acids encoding human secreted proteins - obtained from cDNA 

PT libraries prepared from e.g. liver, ovary, brain, prostate, kidney, lung, 

PT umbilical cord, placenta and colon tissue. 
XX 

PS Claim 27; Page 763-764; 824pp; English. 
XX 

CC AAX41094 to AAX41347 represent 5' expressed sequence tags (ESTs) for 

CC human secreted proteins, and encode the proteins given in AAY12261 to 

CC AAY12514, respectively. The proteins given represent the signal peptide 

CC and an N-terminal fragment of a secreted protein. The nucleic acid 

CC sequences can be used for producing secreted human gene products. They 

CC can also be used to develop products for diagnosis and therapy. The 

CC proteins obtained may have cytokine activity, cell 

CC proliferation/differentiation activity, haematopoiesis regulating 

CC activity, tissue growth regulating activity, reproductive hormone 

CC regulating activity, chemotactic/ chemokinetic activity, haemostatic and

CC thrombolytic activity, receptor/ ligand activity, anti-inflammatory 

CC activity, tumour inhibition activity or other activities . The products 

CC can be used in forensic, gene therapy and chromosome mapping procedures. 

CC The sequences can also be used for obtaining corresponding promoter 

CC sequences. The nucleic acids encoding the signal peptide can be used for 

CC directing extracellular secretion of a polypeptide or the insertion of a 

CC polypeptide into a membrane, or importing a polypeptide into a cell 

XX 

SQ Sequence 148 AA; 

Query Match 46.4%; Score 667.5; DB 2; Length 148; 
Best Local Similarity 83.3%; Pred. No. 5.8e-61; 

Matches 125; Conservative 4; Mismatches 18; Indels 3; Gaps 5 



Qy  63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 122
       ||||||||| |||||||||||| |||||||||||||||||||||||||||||||||||||
Db   1 MAAAWPSGPXAPEAVTARLVGVXWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 60

Qy 123 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 182
       |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db  61 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDXSGNETHFTGNEVGFFKPISCRNVNGYS 120

Qy 183 YKVAVALSLFLGWLGADRFYLGY-PALGLL 211
       |     :|    |:   |  | : |  | :
Db 121 YXXQXXVS--FSWMVGSRSILPWIPCFGFV 148



RESULT 7 
ADB91834 

ID ADB91834 standard; protein; 81 AA. 
XX 

AC ADB91834; 
XX 

DT 04-DEC-2003 (first entry) 
XX 

DE Human secreted protein #SEQ ID 780. 
XX 

KW Secreted protein; gene therapy; antidiabetic; diabetes; human. 
XX 

OS Homo sapiens. 
XX 

PN WO2003004622-A2. 
XX 

PD 16-JAN-2003. 
XX 

PF 19-MAR-2002; 2002WO-US008124 . 
XX 

PR 21-MAR-2001; 2001US-0277340P.

PR 19-JUL-2001; 2001US-0306171P.

PR 13-NOV-2001; 2001US-0331287P.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM; 
XX 

DR WPI; 2003-229407/22. 
XX 

PT Nucleic acid encoding a human secreted protein is useful in diagnosing or 

PT treating diabetes or conditions related to diabetes . 

XX 

PS Claim 3; SEQ ID NO 780; 1537pp; English. 
XX 

CC The invention relates to isolated nucleic acid molecules ADB91065- 

CC ADB91448 and ADB91835-ADB91911 encoding human secreted proteins ADB91449- 

CC ADB91834. Also disclosed is a recombinant vector comprising a 

CC polynucleotide of the invention, and a recombinant host cell comprising 

CC the recombinant vector. The polypeptide of the invention is useful in 

CC identifying a binding partner by contacting the polypeptide with a 

CC binding partner, and determining whether the binding partner increases or 

CC decreases activity of the polypeptide. The polypeptide, polynucleotide, 

CC antibody or its fragment, agonist or antagonist are useful for preparing 

CC a pharmaceutical composition for diagnosing or treating diabetes or 

CC conditions related to diabetes. The present sequence is that of the human 

CC immunoglobulin Fc portion used to generate fusion proteins, increasing 

CC the stability of the fused protein as compared to the secreted protein 

CC only. Note: The sequence data for this patent did not form part of the 

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences. 

XX 

SQ Sequence 81 AA; 



Query Match 29.8%; Score 429; DB 7; Length 81; 

Best Local Similarity 100.0%; Pred. No. 1.5e-36; 

Matches 81; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy 164 GNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGS 223
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 GNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGS 60

Qy 224 LIDFILISMQIVGPSDGSSYI 244
       |||||||||||||||||||||
Db  61 LIDFILISMQIVGPSDGSSYI 81



RESULT 8 
ABB65236 

ID ABB65236 standard; protein; 178 AA. 
XX 

AC ABB65236; 
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 22500. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster. 
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US009231 . 
XX 

PR 23-MAR-2000; 2000US-0191637P . 

PR 11-JUL-2000; 2000US-00614150.
XX 

PA (PEKE ) PE CORP NY.
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL09339. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signaling and cell-cell 

PT interactions. 
XX 

PS Disclosure; SEQ ID NO 22500; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737-

CC ABB72072). The sequence data for this patent did not form part of the

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 



XX 

SQ Sequence 178 AA; 



Query Match 23.5%; Score 338; DB 4; Length 178; 

Best Local Similarity 42.6%; Pred. No. 1.4e-26; 

Matches 69; Conservative 30; Mismatches 49; Indels 14; Gaps 5 

Qy 107 SLKCEDLK-VGQYICKDP KINDATQEPVNCTNY-TAHVSCFPAPNITCKDSSGNETH 161 

: : I : I : : I I : : I I I : I : I I : II I I I I I I : : I I I 

Db 20 NVDCNELQMMGQFMCPDPARGQIDPKTQQLAGCTREGRARVWCIAANEINCTE-TGNAT- 77 

Qy 162 FTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGI 221 

I : : I : III : I I : I I I I I I I I I I I I : I I I I I I I : I : 

Db 78 FTREVPCKWTNGYHLDTTLLLSVFLGMFGVDRFYLGYPGIGLLKFCTLGGMFL 130 

Qy 222 GSLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFR 263 

I I I I : I I : : I : I I I : I I I : I : I I I I : ' I hi 

Db 131 GQLIDIVLIALQWGPADGSAYVIPYYGAGIHIVRSDNTTYR 172



RESULT 9
AAU97631

ID AAU97631 standard; protein; 100 AA.
XX

AC AAU97631;
XX

DT 1?-AUG-2002 (first entry)
XX

DE RNA polymerase II subunit 11 protein.
XX

KW RNA polymerase II subunit 11; cancer; HIV; infection;
KW human immunodeficiency virus.
XX

OS Unidentified.
XX

PN CN1331300-A.
XX

PD 16-JAN-2002.
XX

PF 30-JUN-2000; 2000CN-00116963.
XX

PR 30-JUN-2000; 2000CN-00116963.
XX

PA (BODE-) BODE GENE DEV CO LTD SHANGHAI.
XX

PI Mao Y, Xie Y;
XX

DR WPI; 2002-340664/38.
DR N-PSDB; ABK52558.
XX

PT Polypeptide-RNA polymerase II subunit 11 and polynucleotide for coding
PT it.
XX

PS Claim 1; Page 29; 32pp; Chinese.
XX

CC This invention relates to the DNA and protein sequences of a novel
CC polypeptide-RNA polymerase II subunit 11 protein. The invention also

CC comprises a process for preparing the polypeptide of the invention by DNA 

CC recombination, the application of the polypeptide in treating diseases 

CC such as cancer, human immunodeficiency virus (HIV) infection, etc, the 

CC antagonist of the polypeptide and its medical action, and the application 

CC of the said polynucleotide are disclosed. The present sequence represents 

CC the RNA polymerase II subunit 11 protein of the invention 
XX 

SQ Sequence 100 AA; 

Query Match 20.4%; Score 293; DB 5; Length 100;

Best Local Similarity 98.2%; Pred. No. 2.9e-22; 

Matches 55; Conservative 1; Mismatches 0; Indels 0; Gaps 0; 

Qy  63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQY 118
       ||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
Db   1 MAAAWPSGPSAPDAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQY 56



RESULT 10
ADA57043

ID ADA57043 standard; protein; 221 AA.
XX

AC ADA57043;
XX

DT 20-NOV-2003 (first entry)
XX

DE Human secreted protein #326.
XX

KW immunosuppressive; antiinflammatory; antiasthmatic; antiallergic;
KW cytostatic; cerebroprotective; neuroprotective; nootropic;
KW cardiovascular; antiarteriosclerotic; gene therapy;
KW human secreted protein; immune disorder; inflammation;
KW respiratory disorder; cancer; CNS disorder; neurodegenerative disorders;
KW inflammatory bowel disease; nephritis; Crohn's disease; asthma; allergy;
KW multiple sclerosis; ischaemic brain injury; Parkinson's disease;
KW Alzheimer's disease; atherosclerosis; myocarditis; chromosome mapping;
KW triple helix formation; antisense gene therapy; forensic biology.
XX

OS Homo sapiens.
XX

PN WO2002102994-A2.
XX

PD 27-DEC-2002.
XX

PF 19-MAR-2002; 2002WO-US008278.
XX

PR 21-MAR-2001; 2001US-0277340P.
PR 19-JUL-2001; 2001US-0306171P.
PR 13-NOV-2001; 2001US-0331287P.
XX

PA (HUMA-) HUMAN GENOME SCI INC.
XX

PI Rosen CA, Ruben SM;
XX

DR WPI; 2003-167512/16.
DR N-PSDB; ADA56147.
XX

PT New human secreted polypeptides and polynucleotides, useful for 

PT diagnosing, treating or preventing e.g. immune disorders, inflammatory 

PT conditions, respiratory disorders, cancers, CNS disorders, or 

PT neurodegenerative disorders. 

XX 

PS Claim 13; SEQ ID NO 1233; 1754pp; English. 
XX 

CC The invention relates to 592 new human secreted polypeptides useful for 

CC diagnosing, treating or preventing e.g. immune disorders, inflammatory 

CC conditions, respiratory disorders, cancers, CNS disorders, or 

CC neurodegenerative disorders, or polypeptides comprising an amino acid 

CC sequence at least 95% identical to the new sequences. The polypeptides, 

CC antibodies or antibody fragments that bind to the polypeptides, nucleic 

CC acids encoding the polypeptides, agonists or antagonists that bind to

CC the polypeptide, are useful in preparing diagnostic or pharmaceutical

CC compositions for diagnosing, treating or preventing e.g. immune

CC disorders, inflammatory conditions (e.g. inflammatory bowel disease, 

CC nephritis or Crohn's disease), respiratory disorders (e.g. asthma and 

CC allergy), cancers (e.g. gastric, ovarian or lung cancer), CNS disorders 

CC (e.g. multiple sclerosis or ischaemic brain injury), neurodegenerative 

CC disorders (e.g. Parkinson's disease or Alzheimer's disease), and 

CC cardiovascular disorders (e.g. atherosclerosis or myocarditis) . The 

CC polynucleotides are useful for chromosome identification, chromosome 

CC mapping, for controlling gene expression through triple helix formation 

CC or antisense DNA or RNA, in gene therapy, for identifying individuals 

CC from minute biological samples, in forensic biology, and as hybridization 

CC probes. The polypeptides are useful as molecular weight markers on

CC sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE)

CC gels, to raise antibodies, for testing biological activities, and for 

CC treating or preventing neural disorders, immune system disorders, 

CC muscular, reproductive, gastrointestinal, pulmonary, cardiovascular, 

CC renal, proliferative and/or cancerous diseases. This sequence corresponds 

CC to one of the polypeptides of the invention. Note: The sequence data for

CC this patent did not form part of the printed specification, but was obtained

CC in electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 221 AA; 

Query Match 14.0%; Score 201; DB 6; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 



Qy 135 CTNYTA— HVSC FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188 

Mil: Ml : | | I I : I I II I I : I II : Ml 

Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD HVHCLGNRT- FPKMLYCNWTGGYKWSTALA 165 

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244 

||:|| I II II I II III : I I I : I II M I : I II : II I I I 

Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 11 
ABO14063 

ID ABO14063 standard; protein; 221 AA. 
XX 

AC ABO14063; 



XX

DT 21-AUG-2003 (first entry)
XX

DE Novel human secreted protein #92.
XX

KW Human; secreted protein; cytostatic; neuroprotective; hepatotropic;
KW gene therapy; cancer; liver disorder; hepatitis; neural disorder;
KW Alzheimer's disease.
XX

OS Homo sapiens.
XX

PN US2003028003-A1.
XX

PD 06-FEB-2003.
XX

PF 12-OCT-2001; 2001US-00974879.
XX

PR 07-NOV-1997; 97US-0064900P.
PR 07-NOV-1997; 97US-0064908P.
PR 07-NOV-1997; 97US-0064911P.
PR 07-NOV-1997; 97US-0064912P.
PR 07-NOV-1997; 97US-0064983P.
PR 07-NOV-1997; 97US-0064984P.
PR 07-NOV-1997; 97US-0064985P.
PR 07-NOV-1997; 97US-0064987P.
PR 07-NOV-1997; 97US-0064988P.
PR 17-NOV-1997; 97US-0066089P.
PR 17-NOV-1997; 97US-0066090P.
PR 17-NOV-1997; 97US-0066094P.
PR 17-NOV-1997; 97US-0066095P.
PR 17-NOV-1997; 97US-0066100P.
PR 04-NOV-1998; 98WO-US023435.
PR 05-MAY-1999; 99US-00305736.
PR 13-OCT-2000; 2000US-0239893P.
PR 28-MAR-2001; 2001US-00818683.
XX

PA (ROSE/) ROSEN C A.
PA (FENG/) FENG P.
PA (RUBE/) RUBEN S M.
PA (EBNE/) EBNER R.
PA (OLSE/) OLSEN H S.
PA (NIJJ/) NI J.
PA (WEIY/) WEI Y.
PA (SOPP/) SOPPET D R.
PA (MOOR/) MOORE P A.
PA (KYAW/) KYAW H.
PA (LAFL/) LAFLEUR D W.
PA (SHIY/) SHI Y.
PA (JANA/) JANAT F.
PA (ENDR/) ENDRESS G A.
PA (CART/) CARTER K C.
PA (BIRS/) BIRSE C E.
XX

PI Rosen CA, Feng P, Ruben SM, Ebner R, Olsen HS, Ni J, Wei Y;
PI Soppet DR, Moore PA, Kyaw H, Lafleur DW, Shi Y, Janat F;
PI Endress GA, Carter KC, Birse CE;
XX



DR WPI; 2003-479549/45. 

DR N-PSDB; ACD18950. 
XX 

PT New nucleic acid molecule, useful for preparing a medicament for 

PT preventing, treating or ameliorating a medical condition e.g., cancer, 

PT liver disorders such as hepatitis or neural disorders such as Alzheimer's 

PT disease. 

XX 

PS Claim 11; Page 387-388; 496pp; English. 
XX 

CC The invention describes a new isolated nucleic acid molecule comprising a 

CC sequence having at least 95% identity with a sequence comprising: (a) a 

CC polynucleotide (PN) fragment of a sequence comprising 420-3435 bp, or its 

CC allelic variant; (b) a PN fragment of the cDNA sequence; (c) a PN 

CC sequence encoding a polypeptide, or its fragment, domain, epitope or 

CC species homologue; or (d) a PN that hybridises under stringent conditions 

CC to any one of the sequences of (a)-(c). The nucleic acid is useful for

CC preparing a medicament for preventing, treating or ameliorating a medical 

CC condition e.g., cancer, liver disorders such as hepatitis or neural 

CC disorders such as Alzheimer's disease. This is the amino acid sequence of 

CC a novel human secreted protein 

XX 

SQ Sequence 221 AA; 



Query Match 14.0%; Score 201; DB 6; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221
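
The percentages in the statistics block for this hit appear to follow directly from the raw counts. A minimal sketch, assuming that Query Match is the hit score divided by the query's perfect score (1439 for this query) and that Best Local Similarity is the number of identical positions divided by the total number of alignment columns. The function names are illustrative and are not part of GenCore:

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts reported for RESULT 11 (ABO14063).
print(round(query_match_pct(201, 1439), 1))                 # 14.0
print(round(best_local_similarity_pct(53, 12, 39, 12), 1))  # 45.7
```

The same relations reproduce the figures reported for the other hits in this listing, for example 36 matches over 130 columns giving 27.7%.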



RESULT 12 


ABR47818
ID ABR47818 standard; protein; 221 AA.
XX
AC ABR47818;
XX
DT 12-JUN-2003 (first entry)
XX
DE Human secreted protein, SEQ ID 709.
XX
KW Cardiant; antiarrhythmic; antiarteriosclerotic; vasotropic; cytostatic;
KW vulnerary; antiinflammatory; nootropic; neuroprotective;
KW antiparkinsonian; gene therapy; human; cardiovascular disorder.
XX
OS Homo sapiens.
XX
PN WO200295010-A2.
XX
PD 28-NOV-2002.
XX





PF 19-MAR-2002; 2002WO-US009785 . 
XX 

PR 21-MAR-2001; 2001US-0277340P . 

PR 19-JUL-2001; 2001US-0306171P . 

PR 13-NOV-2001; 2001US-0331287P.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM; 
XX 

DR WPI; 2003-129429/12. 
XX 

PT Novel human secreted proteins, useful for detecting, preventing, 

PT diagnosing, prognosticating, treating and/or ameliorating cardiovascular 

PT disorders such as arrhythmia. 

XX 

PS Claim 13; SEQ ID NO 709; 1881pp; English. 
XX 

CC The present invention relates to novel human secreted proteins (ABR47633- 

CC ABR48145) and their coding sequences (ACC50344-ACC50856). The proteins

CC and their coding sequences are useful for the preparation of a diagnostic 

CC or pharmaceutical composition for diagnosing or treating a cardiovascular 

CC disorder (e.g., arrhythmia, tachycardia, cardiac arrest, coronary 

CC arteriosclerosis and myocardial ischaemia), neural disorders, immune

CC system disorders, muscular disorders, reproductive disorders, 

CC gastrointestinal disorders, pulmonary disorders, renal disorders, 

CC proliferative disorders and/or cancerous diseases and conditions, for 

CC wound healing and epithelial cell proliferation, to treat inflammation or 

CC infection, for treating thrombosis and arteriosclerosis, for treating or 

CC preventing neural damage which occurs in neuronal disorders or 

CC neurodegenerative conditions such as Alzheimer's disease and Parkinson's 

CC disease, to enhance bone and periodontal regeneration and aid in tissue 

CC transplants or bone grafts, to prevent skin aging or hair loss, to 

CC stimulate growth and differentiation of haematopoietic cells and bone 

CC marrow cells when used in combination with other cytokines, to maintain 

CC organs before transplantation or for supporting cell culture of primary 

CC tissues, to increase or decrease differentiation or proliferation of 

CC embryonic stem cells, or to modulate mammalian characteristics or 

CC metabolism. Note: The sequence data for this patent was published in 

CC electronic format and is available from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 221 AA; 



Query Match 14.0%; Score 201; DB 6; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 13 
ABR00112 

ID ABR00112 standard; protein; 221 AA. 
XX 

AC ABR00112; 
XX 

DT 03-APR-2003 (first entry) 
XX 

DE Human gene 102 encoded secreted protein HMEED18, SEQ ID NO: 401. 
XX 

KW Human; secreted protein; digestive disorder; gastrointestinal disorder; 

KW mouth; oesophagus; stomach; small intestine; large intestine; liver; 

KW biliary tract; pancreas; cancer; tumour; hyperproliferative disorder;

KW immune disorder; inflammation; infection; wound healing; drug screening; 

KW chromosome identification; chromosome mapping; cytostatic; 

KW antiinflammatory; immunosuppressive; vulnerary; gene therapy. 
XX 

OS Homo sapiens.
XX 

PN WO200276488-A1. 
XX 

PD 03-OCT-2002. 
XX 

PF 19-MAR-2002; 2002WO-US008276 . 
XX 

PR 21-MAR-2001; 2001US-0277340P . 

PR 19-JUL-2001; 2001US-0306171P.

PR 13-NOV-2001; 2001US-0331287P . 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM; 
XX 

DR WPI; 2003-029900/02. 

DR N-PSDB; ABZ71291. 
XX 

PT New human secreted proteins and nucleic acids, useful for detecting, 

PT preventing, diagnosing, prognosticating, treating and/or ameliorating 

PT e.g. gastrointestinal diseases and disorders, or cancers. 
XX 

PS Claim 13; Page 1007; 1216pp; English. 
XX 

CC ABZ71190-ABZ71478 represent cDNAs corresponding to 178 human secreted 

CC protein genes, and ABP00011-ABP00299 represent the proteins they encode. 

CC ABZ71479-ABZ71540 represent human secreted protein genomic fragments. The 

CC invention also encompasses antibodies specific for the secreted proteins, 

CC the use of the secreted proteins in drug screening, and recombinant 

CC vectors and host cells comprising a nucleic acid of the invention. The 

CC secreted proteins, nucleic acids encoding them, antibodies or antibody 

CC fragments specific for the secreted proteins, and modulators of protein 

CC activity are useful for diagnosing, treating, ameliorating or preventing 

CC digestive disorders. Such conditions include disorders of the mouth, 

CC oesophagus, stomach, small intestine, large intestine, liver, biliary 

CC tract and pancreas, and include cancers of these organs and tissues. The 

CC secreted proteins and their nucleic acids may also be used in the 

CC treatment of immune disorders, inflammation, infection, 

CC hyperproliferative disorders, and to promote wound healing. Nucleic acids



CC of the invention may be used for chromosome identification, chromosome 

CC mapping, in gene therapy, for identifying individuals from minute 

CC biological samples, as hybridisation probes, and as molecular weight 

CC markers. The present sequence represents a human secreted protein of the 

CC invention 

XX 

SQ Sequence 221 AA; 

Query Match 14.0%; Score 201; DB 6; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221
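
The statistics lines that precede each alignment in this listing are regular enough to read mechanically. A small sketch, assuming the `label value;` layout used throughout this report; `parse_stats` is a hypothetical helper written for illustration, not a GenCore utility:

```python
import re

# A field is a label, whitespace, a number (optionally in exponent form),
# an optional percent sign, and then either ';' or the end of the line.
_FIELD = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+([-+]?\d[\d.]*(?:e[-+]?\d+)?)%?\s*(?:;|$)",
    re.MULTILINE,
)


def parse_stats(text: str) -> dict:
    """Return the numeric fields of a statistics block keyed by label."""
    return {label.strip(): float(value) for label, value in _FIELD.findall(text)}


block = """Query Match 14.0%; Score 201; DB 6; Length 221;
Best Local Similarity 45.7%; Pred. No. 3.4e-12;
Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5"""

stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Gaps"])
```

Keeping every value as a float is a simplification; a stricter reader might keep counts such as Matches and Gaps as integers.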



RESULT 14 


ADB91589
ID ADB91589 standard; protein; 221 AA.
XX
AC ADB91589;
XX
DT 04-DEC-2003 (first entry)
XX
DE Human secreted protein #SEQ ID 535.
XX
KW Secreted protein; gene therapy; antidiabetic; diabetes; human.
XX
OS Homo sapiens.
XX
PN WO2003004622-A2.
XX
PD 16-JAN-2003.
XX
PF 19-MAR-2002; 2002WO-US008124.
XX
PR 21-MAR-2001; 2001US-0277340P.
PR 19-JUL-2001; 2001US-0306171P.
PR 13-NOV-2001; 2001US-0331287P.
XX
PA (HUMA-) HUMAN GENOME SCI INC.
XX
PI Rosen CA, Ruben SM;
XX
DR WPI; 2003-229407/22.
XX
PT Nucleic acid encoding a human secreted protein is useful in diagnosing or
PT treating diabetes or conditions related to diabetes.
XX
PS Claim 3; SEQ ID NO 535; 1537pp; English.
XX
CC The invention relates to isolated nucleic acid molecules ADB91065-



CC ADB91448 and ADB91835-ADB91911 encoding human secreted proteins ADB91449- 

CC ADB91834. Also disclosed is a recombinant vector comprising a 

CC polynucleotide of the invention, and a recombinant host cell comprising 

CC the recombinant vector. The polypeptide of the invention is useful in 

CC identifying a binding partner by contacting the polypeptide with a 

CC binding partner, and determining whether the binding partner increases or 

CC decreases activity of the polypeptide. The polypeptide, polynucleotide, 

CC antibody or its fragment, agonist or antagonist are useful for preparing 

CC a pharmaceutical composition for diagnosing or treating diabetes or 

CC conditions related to diabetes. The present sequence is that of the human 

CC immunoglobulin Fc portion used to generate fusion proteins, increasing 

CC the stability of the fused protein as compared to the secreted protein 

CC only. Note: The sequence data for this patent did not form part of the 

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences. 

XX 

SQ Sequence 221 AA; 

Query Match 14.0%; Score 201; DB 7; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 15 
ADC74204 

ID ADC74204 standard; protein; 221 AA. 
XX 

AC ADC74204; 
XX 

DT 01-JAN-2004 (first entry) 
XX 

DE Human secreted protein - SEQ ID 837. 
XX 

KW antianaemic; antirheumatic; antiarthritic; antiinflammatory; antithyroid; 

KW antidiabetic; immunosuppressive; dermatological; nephrotropic;

KW antiparkinsonian; neuroprotective; nootropic; antibacterial; virucide; 

KW fungicide; antiparasitic; antiarteriosclerotic; vulnerary; cytostatic; 

KW haemopoietic; haematologic; anaemia; autoimmune disorder; 

KW rheumatoid arthritis; inflammation; Grave's disease; diabetes;

KW systemic lupus erythematosus; glomerulonephritis; neurodegenerative; 

KW Parkinson's; Alzheimer's; wound; hyperproliferative; atherosclerosis;

KW cancer; bacterial; viral; fungal; parasitic infection; gene therapy; 

KW human.

XX 

OS Homo sapiens. 
XX 

PN WO2003038063-A2. 
XX 

PD 08-MAY-2003. 



XX 

PF 19-MAR-2002; 2002WO-US008277 . 
XX 

PR 21-MAR-2001; 2001US-0277340P . 

PR 19-JUL-2001; 2001US-0306171P . 

PR 13-NOV-2001; 2001US-0331287P.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM; 
XX 

DR WPI; 2003-430516/40. 

DR N-PSDB; ADC73589. 
XX 

PT New human secreted polypeptide for diagnosing, preventing or treating 

PT hematopoietic or hematologic disorders (e.g. anemia), autoimmune 

PT disorders (e.g. diabetes) or hyperproliferative disorders (e.g. cancer or

PT atherosclerosis).

XX 

PS Claim 16; SEQ ID NO 837; 2272pp; English. 
XX 

CC The invention relates to a novel human secreted polypeptide comprising a 

CC defined sequence given in the specification. The polypeptide, nucleic 

CC acid molecule, antibody, agonist or antagonist of the invention may be 

CC useful for preparing a composition for diagnosing or treating a 

CC haemopoietic or haematologic disorder such as anaemia, autoimmune 

CC disorders such as rheumatoid arthritis, inflammation, Grave's disease, 

CC diabetes, systemic lupus erythematosus or glomerulonephritis, 

CC neurodegenerative disorders including Parkinson's disease and Alzheimer's 

CC disease, wounds and hyperproliferative disorders including

CC atherosclerosis or cancer, as well as bacterial, viral, fungal or 

CC parasitic infections. The polypeptide may also be used during gene 

CC therapy procedures and for identifying a binding partner by contacting 

CC the polypeptide with a binding partner and determining whether the 

CC binding partner increases or decreases the activity of the polypeptide. 

CC The current sequence is that of the human secreted protein of the 

CC invention. 

XX 

SQ Sequence 221 AA; 

Query Match 14.0%; Score 201; DB 7; Length 221; 

Best Local Similarity 45.7%; Pred. No. 3.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 



Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



Search completed: March 4, 2004, 10:24:13 
Job time : 99 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         March 4, 2004, 10:22:25 ; Search time 43 Seconds
                (without alignments)
                322.962 Million cell updates/sec

Title:          US-09-852-100B-2
Perfect score:  1439
Sequence:       1 MHILKGSPNVIPRAHGQKNT...TRLTRLSITNETFRKTQLYP 269

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       389414 seqs, 51625971 residues

Total number of hits satisfying chosen parameters:      389414

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database :      Issued_Patents_AA:*
                1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*
                2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*
                3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*
                4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*
                5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*
                6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
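
The throughput figure in the run header can be cross-checked against the other values reported there. A rough sketch, assuming that a "cell update" means one dynamic-programming cell, that is, one query residue compared against one database residue; the variable names are illustrative:

```python
query_length = 269          # residues in the query (US-09-852-100B-2)
db_residues = 51_625_971    # residues searched, from the run header
search_seconds = 43         # reported search time, without alignments

# Every query residue is compared against every database residue once.
cells = query_length * db_residues
rate_millions = cells / search_seconds / 1e6

print(f"{rate_millions:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the 322.962 million cell updates per second printed in the header.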

SUMMARIES 



Result           Query
   No.   Score   Match  Length  DB  ID                     Description

     1    85.5     5.9    1023   4  US-10-164-595-20       Sequence 20, Appl
     2    85       5.9     310   2  US-08-414-657D-45      Sequence 45, Appl
     3    85       5.9     338   2  US-08-414-657D-42      Sequence 42, Appl
     4    85       5.9     338   2  US-08-414-657D-43      Sequence 43, Appl
     5    85       5.9     338   4  US-09-135-080-4        Sequence 4, Appli
     6    84       5.8     258   4  US-09-328-352-4253     Sequence 4253, Ap
     7    83       5.8     764   2  US-08-177-109A-2       Sequence 2, Appli
     8    83       5.8     764   2  US-08-687-706-2        Sequence 2, Appli
     9    82.5     5.7     338   4  US-09-976-594-404      Sequence 404, App
    10    81       5.6     797   3  US-09-182-728A-2       Sequence 2, Appli
    11    81       5.6     797   4  US-09-795-232-2        Sequence 2, Appli
    12    80.5     5.6     150   4  US-09-252-991A-16958   Sequence 16958, A



[Summary rows 13-40 are not legible in the scanned source.]


    41    77       5.4     797   4  US-09-191-468-120      Sequence 120, App
    42    77       5.4     797   4  US-09-191-468-122      Sequence 122, App
    43    76.      5.3     467   4  US-09-411-132A-4       Sequence 4, Appli
    44    76       5.3     488   4  US-09-252-991A-26323   Sequence 26323, A
    45    76       5.3     525   1  US-08-356-340-2        Sequence 2, Appli



ALIGNMENTS 



RESULT 1 

US-10-164-595-20 

; Sequence 20, Application US/10164595
; Patent No. 6657054
; GENERAL INFORMATION:
; APPLICANT: OriGene Technologies, Inc
; TITLE OF INVENTION: Regulated Angiogenesis Genes and Polypeptides
; FILE REFERENCE: 1U 103 Rl
; CURRENT APPLICATION NUMBER: US/10/164,595
; CURRENT FILING DATE: 2002-06-10
; NUMBER OF SEQ ID NOS: 80
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 20
; LENGTH: 1023
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-164-595-20

Query Match 5.9%; Score 85.5; DB 4; Length 1023;



Best Local Similarity 20.0%; Pred. No. 10; 

Matches 68; Conservative 32; Mismatches 103; Indels 137; Gaps 14 



Qy 3 ILKGSPNVIP RAHGQ KNTRRDGTGLYPMRGPFKNLALLPFSLPLL- 47 

: I I : I I III: I I : : I II I : : I : I I 
Db 328 VSSGKPSVAPKPAANRASGEWDSGTENRLKVTSKEGLTPYP PLQEAGSIPVTKPELP 384 

Qy 48 GGG GSGSGEKV SVSKMAAAWPSGPSAPE 75 

II Mhll III : I I I I 

Db 385 KKPNPGLIRSWPEIPGRGPLAESSDSGKKVPTPAPRPLLLKKSVSSENPTYPSAPLKPV 444 

Qy 76 AVTARLVG VLW FVSVTT G PWGAVAT S A 102 

I I I I : : | | I : I : 

Db 445 TVPPRLAGASQ7VKAYKSLGEGPPANPPVPVLQSKPLVDIDLISFDDDVLPTPSGNLAEES 504 

Qy 103 GGEESL KCEDLK VGQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNIT 151 

||: I I : I I : I : : I I I I = 

Db 505 VGSEMVLDPFQLPAKTEPIKERAVQPAPTRKPTVIRIPAKPGKC LHEDPQSPPPLP 560 

Qy 152 CKDSSGN ETHFTGNEVGFFK PI SCRNVNGYSYKVAVALS 190 

: || I : : I I I : : I I I I : : 
Db 561 AEKPIGNTFSTVSGKLSNVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGHLIMTTI 617 

Qy 191 LFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILI 230 

II: I : I I I I : : I I : 

Db 618 LFMSCSARAR MGFTGIVHILRFKLL 642 



RESULT 2 

US-08-414-657D-45 

; Sequence 45, Application US/08414657D
; Patent No. 5861283
; GENERAL INFORMATION:
; APPLICANT: Levitt, Pat
; APPLICANT: Pimenta, Aurea
; APPLICANT: Fischer, Itzhak
; APPLICANT: Zhukareva, Victoria
; TITLE OF INVENTION: Limbic System-Associated Membrane
; TITLE OF INVENTION: Protein and DNA
; NUMBER OF SEQUENCES: 60
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Dechert Price & Rhoads
; STREET: 997 Lenox Drive, Building 3, Suite 210
; CITY: Lawrenceville
; STATE: NJ
; COUNTRY: USA
; ZIP: 08543
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/414,657D
; FILING DATE: 31-MAR-1995
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: Bloom, Allen
; REGISTRATION NUMBER: 29,135
; REFERENCE/DOCKET NUMBER: 317743-102
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 609-520-3214
; TELEFAX: 609-520-3259
; TELEX:
; INFORMATION FOR SEQ ID NO: 45:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 310 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear

US-08-414-657D-45 



Query Match 5.9%; Score 85; DB 2; Length 310; 

Best Local Similarity 27.7%; Pred. No. 2.1; 

Matches 36; Conservative 15; Mismatches 47; Indels 32; Gaps 7 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : I I I I I I : : I : I I I : I I I : I : : 

Db 202 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQS SLTVTNVT-EEHY 257 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209 

Ml I :: I I : I I I : I I I : II I II I 
Db 258 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LA 300 



Qy 210 LLKFCTVGFC 219 

II: I 

Db 301 ASLFCLLSKC 310 



RESULT 3 

US-08-414-657D-42 

; Sequence 42, Application US/08414657D
; Patent No. 5861283
; GENERAL INFORMATION:
; APPLICANT: Levitt, Pat
; APPLICANT: Pimenta, Aurea
; APPLICANT: Fischer, Itzhak
; APPLICANT: Zhukareva, Victoria
; TITLE OF INVENTION: Limbic System-Associated Membrane
; TITLE OF INVENTION: Protein and DNA
; NUMBER OF SEQUENCES: 60
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Dechert Price & Rhoads
; STREET: 997 Lenox Drive, Building 3, Suite 210
; CITY: Lawrenceville
; STATE: NJ
; COUNTRY: USA
; ZIP: 08543
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/414,657D
; FILING DATE: 31-MAR-1995
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: Bloom, Allen
; REGISTRATION NUMBER: 29,135
; REFERENCE/DOCKET NUMBER: 317743-102
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 609-520-3214
; TELEFAX: 609-520-3259
; TELEX:
; INFORMATION FOR SEQ ID NO: 42:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 338 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
US-08-414-657D-42 

Query Match 5.9%; Score 85; DB 2; Length 338; 

Best Local Similarity 27.7%; Pred. No. 2.4; 

Matches 36; Conservative 15; Mismatches 47; Indels 32; Gaps 7 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : Mill I :: I :|| I : I I 1:1 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 285 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209 

Ml I :: I I : I I I : I I I : I I I II I 
Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LA 328 

Qy 210 LLKFCTVGFC 219 

II: I 

Db 329 ASLFCLLSKC 338



RESULT 4 

US-08-414-657D-43 

; Sequence 43, Application US/08414657D
; Patent No. 5861283
; GENERAL INFORMATION:
; APPLICANT: Levitt, Pat
; APPLICANT: Pimenta, Aurea
; APPLICANT: Fischer, Itzhak
; APPLICANT: Zhukareva, Victoria
; TITLE OF INVENTION: Limbic System-Associated Membrane
; TITLE OF INVENTION: Protein and DNA
; NUMBER OF SEQUENCES: 60
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Dechert Price & Rhoads
; STREET: 997 Lenox Drive, Building 3, Suite 210
; CITY: Lawrenceville
; STATE: NJ
; COUNTRY: USA
; ZIP: 08543
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/414,657D
; FILING DATE: 31-MAR-1995
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: Bloom, Allen
; REGISTRATION NUMBER: 29,135
; REFERENCE/DOCKET NUMBER: 317743-102
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 609-520-3214
; TELEFAX: 609-520-3259
; TELEX:
; INFORMATION FOR SEQ ID NO: 43:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 338 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear

US-08-414-657D-43 



Query Match 5.9%; Score 85; DB 2; Length 338; 

Best Local Similarity 27.7%; Pred. No. 2.4; 

Matches 36; Conservative 15; Mismatches 47; Indels 32; Gaps 7; 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : I I I I I I : : I : I I I : I I I : I : : 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 285

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209 

Ml |::| I : I I I : I I I : I I I II I 
Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LA 328 

Qy 210 LLKFCTVGFC 219 

II: I 

Db 329 ASLFCLLSKC 338 



RESULT 5 
US-09-135-080-4 

; Sequence 4, Application US/09135080
; Patent No. 6423827
; GENERAL INFORMATION:
; APPLICANT: Levitt, Pat R.
; APPLICANT: Pimenta, Aurea
; APPLICANT: Fischer, Itzhak
; APPLICANT: Zhukareva, Victoria
; TITLE OF INVENTION: Limbic System-Associated Membrane
; TITLE OF INVENTION: Protein and DNA
; NUMBER OF SEQUENCES: 29
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Dechert Price & Rhoads
; STREET: 997 Lenox Drive, Building 3, Suite 210
; CITY: Lawrenceville
; STATE: NJ
; COUNTRY: USA
; ZIP: 08543
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/09/135,080
; FILING DATE: 17-AUG-1998
; CLASSIFICATION: 424
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/414,657
; FILING DATE: 31-MAR-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Bloom, Allen
; REGISTRATION NUMBER: 29,135
; REFERENCE/DOCKET NUMBER: 317743-102A
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 609-620-3214
; TELEFAX: 609-620-3259
; TELEX:
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 338 amino acids
; TYPE: amino acid
; STRANDEDNESS: single
; TOPOLOGY: linear
US-09-135-080-4 



Query Match 5.9%; Score 85; DB 4; Length 338; 

Best Local Similarity 27.7%; Pred. No. 2.4; 

Matches 36; Conservative 15; Mismatches 47; Indels 32; Gaps 7 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156

: I : I I I I I I : : I : I I I : I I I : I = = 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 285

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209

Ml I : : I I : I I I : I I I : I I I II I 

Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LA 328



Qy 210 LLKFCTVGFC 219 

II: I 

Db 329 ASLFCLLSKC 338 



RESULT 6 



US-09-328-352-4253 

; Sequence 4253, Application US/09328352
; Patent No. 6562958
; GENERAL INFORMATION:
; APPLICANT: Gary L. Breton et al.
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO ACINETOBACTER
; TITLE OF INVENTION: BAUMANNII FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: GTC99-03PA
; CURRENT APPLICATION NUMBER: US/09/328,352
; CURRENT FILING DATE: 1999-06-04
; NUMBER OF SEQ ID NOS: 8252
; SEQ ID NO 4253
; LENGTH: 258
; TYPE: PRT
; ORGANISM: Acinetobacter baumannii
US-09-328-352-4253 

Query Match 5.8%; Score 84; DB 4; Length 258; 

Best Local Similarity 23.5%; Pred. No. 2; 

Matches 52; Conservative 24; Mismatches 93; Indels 52; Gaps 10; 

Qy 78 TARLVGVL WFVSVTTGPW GAVAT S AGGE E S LKC E D L KV G 116 

I I I I I : I I : I | | | :: | I 

Db 57 TTGLYGPLNVEWTTRLERGPYWSEKIDEKGTFFRGAPGSISIRSPDYPSIPGQPAATDGG 116 

Qy 117 QYICKDPKINDATQEPVNCTNY--TAHVSC-FPAPNITCKDSSGNETHFTGNEVGFFKPI 173

1:1111 III III I : I : I I I : : I 

Db 117 FYLPKDPK EPVKI YRYFTT KAVP VEVP S DNVTC NTLAYTKEP 158 

Qy 174 SCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQ 233 

: | : | : | : I : I I : I I I I I : I : : 

Db 159 ASHKVSLVSFATAGTVGGVTGAIIGKNFSSGNMSYGQATGAGAAGGAIGGLIVAAIINAE 218

Qy 234 IVG — PSDGSSYIIDYYGTRLTRLSITNETFRKTQLYP 269 

I : I I ||:: : I I I : : I I 

Db 219 VGKIIGGLPIKESSFM EKLRELGAKREPLKQI SLLP 254 



RESULT 7 

US-08-177-109A-2 

; Sequence 2, Application US/08177109A
; Patent No. 5869615
; GENERAL INFORMATION:
; APPLICANT: Dennis E. Hourcade and Teresa J. Oglesby
; TITLE OF INVENTION: MODIFIED COMPLEMENT PROTEASES
; NUMBER OF SEQUENCES: 62
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Patrea L. Pabst
; STREET: 2800 One Atlantic Center
; STREET: 1201 West Peachtree Street
; CITY: Atlanta
; STATE: Georgia
; COUNTRY: USA
; ZIP: 30309-3450
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/177,109A
; FILING DATE: 03-JAN-1994
; CLASSIFICATION: 514
; ATTORNEY/AGENT INFORMATION:
; NAME: Pabst, Patrea L.
; REGISTRATION NUMBER: 31,284
; REFERENCE/DOCKET NUMBER: WU 107
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: (404) 873-8794
; TELEFAX: (404) 873-8795
; INFORMATION FOR SEQ ID NO: 2:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 764 amino acids
; TYPE: amino acid
; TOPOLOGY: linear
; MOLECULE TYPE: protein
; HYPOTHETICAL: NO
US-08-177-109A-2 



Query Match 5.8%; Score 83; DB 2; Length 764; 

Best Local Similarity 24.1%; Pred. No. 12; 

Matches 49; Conservative 21; Mismatches 71; Indels 62; Gaps 12 

Qy 24 GTGLYPMRGPFKNLALLPFSLPLLGGG GSGSGEKVSV 60 

I : I I 11:1111111 Mill: 

Db 2 GSNLSP QLCLMPFILGLLSGGVTTTPWSLAQPQGSCSLEGVEIKGGSFRLLQEG 55

Qy 61 S KMAAAW P S G — P SAP EAVT ARLVGVLW FVS VT T GPWGAVAT S AGGEESLKC — 110 

: III I : I I :M I : I :: I 

Db 56 QALEYVCPSGFYPYPVQTRTCR----------STGSWSTLKTQDQKTVRKAECRAIHCPR 105 

Qy 111 -EDLKVGQYICKDPKINDATQEPVNC-TNYTAHVSCFPAPNITCKDSS--GNETHFTGNE 166 

I : I : I : I I : : : I II I III:: : I I 

Db 106 PHDFENGEYWPRSPYYNVSDEISFHCYDGYTLRGSA----NRTCQVNGRWSGQTAICDNG 161 



Qy 167 VGFFK----PISCRNVNGYSYKV 185 

I : II III I : : 

Db 162 AGYCSNPGIPIGTRKV-GSQYRL 183 



RESULT 8 
US-08-687-706-2 

; Sequence 2, Application US/08687706 

; Patent No. 5928892 

; GENERAL INFORMATION: 

APPLICANT: Dennis E. Hourcade and Teresa J. Oglesby 
TITLE OF INVENTION: MODIFIED COMPLEMENT PROTEASES 

; NUMBER OF SEQUENCES: 62 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Patrea L. Pabst 
STREET: 2800 One Atlantic Center 

; STREET: 1201 West Peachtree Street 

; CITY: Atlanta 



; STATE: Georgia 

; COUNTRY: USA 

ZIP: 30309-3450 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/687,706 
; FILING DATE: 26-JUL-1996 

; CLASSIFICATION: 514 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/177,109 

FILING DATE: 03-JAN-1994 

CLASSIFICATION: 514 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Pabst, Patrea L. 

; REGISTRATION NUMBER: 31,284 

REFERENCE/DOCKET NUMBER: WU 107 DIV 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (404) 873-8794 

TELEFAX: (404) 873-8795 
; INFORMATION FOR SEQ ID NO: 2: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 764 amino acids 

; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
; HYPOTHETICAL: NO 
US-08-687-706-2 

Query Match 5.8%; Score 83; DB 2; Length 764; 

Best Local Similarity 24.1%; Pred. No. 12; 

Matches 49; Conservative 21; Mismatches 71; Indels 62; Gaps 12; 

Qy 24 GTGLYPMRGPFKNLALLPFSLPLLGGG-----------------------GSGSGEKVSV 60 

I : I I Ihlllllll Mill: 

Db 2 GSNLSP------QLCLMPFILGLLSGGVTTTPWSLAQPQGSCSLEGVEIKGGSFRLLQEG 55 

Qy 61 SKMAAAWPSG--PSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKC-------- 110 

Db 56 QALEYVCPSGFYPYPVQTRTCR----------STGSWSTLKTQDQKTVRKAECRAIHCPR 105 

Qy 111 -EDLKVGQYICKDPKINDATQEPVNC-TNYTAHVSCFPAPNITCKDSS--GNETHFTGNE 166 

Db 106 PHDFENGEYWPRSPYYNVSDEISFHCYDGYTLRGSA----NRTCQVNGRWSGQTAICDNG 161 

Qy 167 VGFFK----PISCRNVNGYSYKV 185 

Db 162 AGYCSNPGIPIGTRKV-GSQYRL 183 



RESULT 9 

US-09-976-594-404 

; Sequence 404, Application US/09976594 
; Patent No. 6673549 



GENERAL INFORMATION: 
APPLICANT: Furness, Michael 
APPLICANT: Buchbinder, Jenny 

TITLE OF INVENTION: GENES EXPRESSED IN C3A LIVER CELL CULTURES TREATED WITH 
STEROIDS 

FILE REFERENCE: PA-0041 US 

CURRENT APPLICATION NUMBER: US/09/976,594 
CURRENT FILING DATE: 2001-10-12 
PRIOR APPLICATION NUMBER: 60/240,409 
PRIOR FILING DATE: 2000-10-12 
NUMBER OF SEQ ID NOS: 1143 
SOFTWARE: PERL Program 
SEQ ID NO 404 
LENGTH: 338 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: Incyte ID No. 6673549 1640555CD1 
US-09-976-594-404 

Query Match 5.7%; Score 82.5; DB 4; Length 338; 

Best Local Similarity 29.6%; Pred. No. 4.2; 

Matches 37; Conservative 14; Mismatches 47; Indels 27; Gaps 7 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: | : I I I I I I : : I : I I I : I I I : I : : 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 285 

Qy 157 GNETHFTGNEVG-------FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209 

M I I : : I I : I I I : I I I : I I I II I I 

Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL-----WL------LAASLLC 333 

Qy 210 LLKFC 214 

I I I 

Db 334 LLSKC 338 



RESULT 10 
US-09-182-728A-2 

Sequence 2, Application US/09182728A 
Patent No. 6238883 
GENERAL INFORMATION: 
APPLICANT: BROWN, ANTHONY 
APPLICANT: CHAPMAN, CONRAD GERALD 
APPLICANT: GLOGER, ISRAEL SIMON 
APPLICANT: EVANS, JOANNE RACHEL 
APPLICANT: CAIRNS, WILLIAM 
APPLICANT: HERDON, HUGH 
TITLE OF INVENTION: NOVEL COMPOUNDS 
FILE REFERENCE: GP-30176 

CURRENT APPLICATION NUMBER: US/09/182,728A 
CURRENT FILING DATE: 1998-10-29 
PRIOR APPLICATION NUMBER: 9818890.7 
PRIOR FILING DATE: 1998-08-28 
NUMBER OF SEQ ID NOS: 6 

SOFTWARE: FastSEQ for Windows Version 3.0 



; SEQ ID NO 2 
; LENGTH: 797 
; TYPE: PRT 

ORGANISM: HOMO SAPIENS 
US-09-182-728A-2 

Query Match 5.6%; Score 81; DB 3; Length 797; 

Best Local Similarity 23.9%; Pred. No. 21; 

Matches 47; Conservative 32; Mismatches 80; Indels 38; Gaps 11 

Qy 87 FVSVTTGPWGAVATSAGGEESLKCED---LKVGQYICKD-PKINDATQEPVNCTNYTAHV 142 

I I I I III: I hi I : : I I I I I I : 

Db 302 FVSVL--PWGSCNNPWNTPE---CKDKTKLLLDSCVISDHPKI--------QIKNSTFCM 348 

Qy 143 SCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSY KVAVALSLFLGWLGA 198 

: : I : I I : I : I : I : I I : I I : : I I I I I I : 

Db 349 TAYPNWMWFTSQANKTFVSGSE-EYFKYFVLKISAGIEYPGEIRWPLALCLFLAWV — 405 

Qy 199 DRFYLGYPAL GLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYIIDYYGTR 251 

: I : I | : : | | : | : : : : : | III : 

Db 406 IVYASLAKGIKTSGKVVYFTATFPYV-VLVILLIRGWLPGAGAGIWYFITPKWEK 460 

Qy 252 LTRLSITNETFRKTQLY 268 

II :: : II:: 

Db 461 LTNATVWKDA--ATQIF 475 



RESULT 11 
US-09-795-232-2 

; Sequence 2, Application US/09795232 

; Patent No. 6426405 

; GENERAL INFORMATION: 

; APPLICANT: Anthony M. Brown 

; APPLICANT: Conrad Gerald Chapman 

; APPLICANT: Israel Simon Gloger 

APPLICANT: Joanne Rachel Evans 
; APPLICANT: William Cairns 
; APPLICANT: Hugh Jonathan Herdon 
; TITLE OF INVENTION: NOVEL COMPOUNDS 

FILE REFERENCE: GP-30176-D1 
; CURRENT APPLICATION NUMBER: US/09/795,232 
; CURRENT FILING DATE: 2001-02-28 
; PRIOR APPLICATION NUMBER: 09/182,728 
; PRIOR FILING DATE: 1998-10-29 
; PRIOR APPLICATION NUMBER: 9818890.7 
; PRIOR FILING DATE: 1998-08-28 
; NUMBER OF SEQ ID NOS: 6 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 2 

LENGTH: 797 

TYPE: PRT 

ORGANISM: HOMO SAPIENS 
US-09-795-232-2 



Query Match 5.6%; Score 81; DB 4; Length 797; 

Best Local Similarity 23.9%; Pred. No. 21; 

Matches 47; Conservative 32; Mismatches 80; Indels 38; Gaps 11 



Qy 87 FVSVTTGPWGAVATSAGGEESLKCED LKVGQYICKD-PKINDATQEPVNCTNYTAHV 142 

I I I I III: I I : I I : : I I I I I I = 

Db 302 FVSVL--PWGSCNNPWNTPE---CKDKTKLLLDSCVISDHPKI--------QIKNSTFCM 348 

Qy 143 SCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSY KVAVALSLFLGWLGA 198 

: : | : I I : I : I : I : I I : I I : : I I I I I I : 

Db 349 TAYPNWKVNFTSQANKTFVSGSE-EYFKYFVLKISAGIEYPGEIRWPLALCLFLAWV-- 405 

Qy 199 DRFYLGYPAL GLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYIIDYYGTR 251 

: I : I | : : | | : |: :: : : I III : 

Db 406 IVYASLAKGIKTSGKVVYFTATFPYV-VLVILLIRGVTLPGAGAGIWYFITPKWEK 460 

Qy 252 LTRLSITNETFRKTQLY 268 

II :: : I I :: 

Db 461 LTNATVWKDA--ATQIF 475 



RESULT 12 

US-09-252-991A-16958 

; Sequence 16958, Application US/09252991A 
; Patent No. 6551795 
; GENERAL INFORMATION: 

; APPLICANT: Marc J. Rubenfield et al. 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO 
PSEUDOMONAS 

; TITLE OF INVENTION: AERUGINOSA FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.136 

; CURRENT APPLICATION NUMBER: US/09/252,991A 

; CURRENT FILING DATE: 1999-02-18 

; PRIOR APPLICATION NUMBER: US 60/074,788 

; PRIOR FILING DATE: 1998-02-18 

; PRIOR APPLICATION NUMBER: US 60/094,190 

; PRIOR FILING DATE: 1998-07-27 

; NUMBER OF SEQ ID NOS: 33142 

; SEQ ID NO 16958 

LENGTH: 150 

TYPE: PRT 

; ORGANISM: Pseudomonas aeruginosa 
US-09-252-991A-16958 

Query Match 5.6%; Score 80.5; DB 4; Length 150; 

Best Local Similarity 40.0%; Pred. No. 2.1; 

Matches 20; Conservative 7; Mismatches 22; Indels 1; Gaps 1 

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILI 230 

:| : | :| |: |: Ml I I I : I I I II Ml II 
Db 24 HSKAIGYLLWIF-GFTGSHRFYYGKPITGTIWFFTFGLFFIGWIIDLFLI 72 



RESULT 13 
US-08-414-657D-44 

; Sequence 44, Application US/08414657D 

; Patent No. 5861283 

; GENERAL INFORMATION: 

APPLICANT: Levitt, Pat 
APPLICANT: Pimenta, Aurea 



APPLICANT: Fischer, Itzhak 
; APPLICANT: Zhukareva, Victoria 

; TITLE OF INVENTION: Limbic System-Associated Membrane 

; TITLE OF INVENTION: Protein and DNA 

; NUMBER OF SEQUENCES: 60 

; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Dechert Price & Rhoads 

; STREET: 997 Lenox Drive, Building 3, Suite 210 

; CITY: Lawrenceville 

STATE: NJ 

COUNTRY: USA 
; ZIP: 08543 

; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Diskette 

; COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ for Windows Version 2.0 

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/414,657D 

FILING DATE: 31-MAR-1995 
; CLASSIFICATION: 435 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Bloom, Allen 

; REGISTRATION NUMBER: 29,135 

REFERENCE/DOCKET NUMBER: 317743-102 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 609-520-3214 

TELEFAX: 609-520-3259 
; TELEX: 

INFORMATION FOR SEQ ID NO: 44: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 304 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

US-08-414-657D-44 

Query Match 5.6%; Score 80.5; DB 2; Length 304; 

Best Local Similarity 29.9%; Pred. No. 5.8; 

Matches 32; Conservative 14; Mismatches 40; Indels 21; Gaps 6 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : I I I I I I : : I : I I I : I I I : I : : 

Db 202 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE---GQSSLTVTNVT-EEHY 257 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWL 196 

I I I I :: I | : | | | : | | | :|| I I I 

Db 258 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL-----WL 298 



RESULT 14 
US-08-414-657D-2 

; Sequence 2, Application US/08414657D 
; Patent No. 5861283 



GENERAL INFORMATION: 

APPLICANT: Levitt, Pat 
APPLICANT: Pimenta, Aurea 
APPLICANT: Fischer, Itzhak 
APPLICANT: Zhukareva, Victoria 

TITLE OF INVENTION: Limbic System-Associated Membrane 
TITLE OF INVENTION: Protein and DNA 
NUMBER OF SEQUENCES: 60 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Dechert Price & Rhoads 
STREET: 997 Lenox Drive, Building 3, Suite 210 
CITY: Lawrenceville 
STATE: NJ 
COUNTRY: USA 
ZIP: 08543 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/414,657D 
FILING DATE: 31-MAR-1995 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 
APPLICATION NUMBER: 
FILING DATE: 
ATTORNEY/AGENT INFORMATION: 
NAME: Bloom, Allen 
REGISTRATION NUMBER: 29,135 
REFERENCE/DOCKET NUMBER: 317743-102 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 609-520-3214 
TELEFAX: 609-520-3259 
TELEX: 

INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 325 amino acids 
TYPE: amino acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
FRAGMENT TYPE: internal 
US-08-414-657D-2 

Query Match 5.6%; Score 80.5; DB 2; Length 325; 

Best Local Similarity 29.9%; Pred. No. 6.4; 

Matches 32; Conservative 14; Mismatches 40; Indels 21; Gaps 6; 
Qy 101 SAGGEESLKCEDLKVG----QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : I I I II I : : I : I I I : I I I : I :: 

Db 223 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 278 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWL 196 

III I : : I I : I I I : I I I : I I I II 
Db 279 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL 319 



RESULT 15 



US-08-414-657D-41 

Sequence 41, Application US/08414657D 
Patent No. 5861283 
GENERAL INFORMATION: 
APPLICANT: Levitt, Pat 
APPLICANT: Pimenta, Aurea 
APPLICANT: Fischer, Itzhak 
APPLICANT: Zhukareva, Victoria 

TITLE OF INVENTION: Limbic System-Associated Membrane 
TITLE OF INVENTION: Protein and DNA 
NUMBER OF SEQUENCES: 60 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Dechert Price & Rhoads 
STREET: 997 Lenox Drive, Building 3, Suite 210 
CITY: Lawrenceville 
STATE: NJ 
COUNTRY: USA 
ZIP: 08543 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/414,657D 
FILING DATE: 31-MAR-1995 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 
APPLICATION NUMBER: 
FILING DATE: 
ATTORNEY/AGENT INFORMATION: 
NAME: Bloom, Allen 
REGISTRATION NUMBER: 29,135 
REFERENCE/DOCKET NUMBER: 317743-102 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 609-520-3214 
TELEFAX: 609-520-3259 
TELEX: 

INFORMATION FOR SEQ ID NO: 41: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 325 amino acids 
TYPE: amino acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
US-08-414-657D-41 

Query Match 5.6%; Score 80.5; DB 2; Length 325; 

Best Local Similarity 29.9%; Pred. No. 6.4; 

Matches 32; Conservative 14; Mismatches 40; Indels 21; Gaps 6 

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : Mill I :: I Ml I : I I IM :: 

Db 223 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE---GQSSLTVTNVT-EEHY 278 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWL 196 

M I I : M I M I I M I I MM II 
Db 279 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL 319 



Search completed: March 4, 2004, 10:28:43 
Job time : 44 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on:          March 4, 2004, 10:16:25 ; Search time 44 Seconds 
                                           (without alignments) 
                                           588.080 Million cell updates/sec 

Title:           US-09-852-100B-2 
Perfect score:   1439 
Sequence:        1 MHILKGSPNVIPRAHGQKNT...TRLTRLSITNETFRKTQLYP 269 

Scoring table:   BLOSUM62 
                 Gapop 10.0 , Gapext 0.5 

Searched:        283366 seqs, 96191526 residues 

Total number of hits satisfying chosen parameters:   283366 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
                 Maximum Match 100% 
                 Listing first 45 summaries 

Database : 



PIR_78:* 
pir1 : * 
pir2 : * 
pir3 : * 
pir4 : * 
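
The throughput figure in the run header can be reproduced from the other header values if one assumes that a "cell update" is one dynamic-programming cell, that is, one query residue compared against one database residue. The sketch below is only a consistency check under that assumption; the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_length, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumes every query residue is compared once with every database residue.
    """
    return query_length * db_residues / seconds / 1e6


# Header values: 269-residue query, 96191526 database residues, 44 seconds.
rate = million_cell_updates_per_sec(269, 96191526, 44)
print(f"{rate:.3f} Million cell updates/sec")
```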



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
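
The scores in the listing come from a local (Smith-Waterman, "sw model") search with affine gap penalties (Gapop 10.0, Gapext 0.5). As a rough illustration of how such a score is computed, here is a minimal sketch of the standard three-matrix recurrence. It is not GenCore's implementation: the toy `identity_score` function stands in for BLOSUM62, and the convention that a gap of length k costs `gap_open + (k - 1) * gap_extend` is an assumption.

```python
def smith_waterman_affine(query, target, score, gap_open=10.0, gap_extend=0.5):
    """Return the best local alignment score with affine gap penalties."""
    n, m = len(query), len(target)
    neg_inf = float("-inf")
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap in the query (target residue unpaired)
    # F: best score ending with a gap in the target (query residue unpaired)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(query[i - 1], target[j - 1])
            # Local alignment: never let the running score drop below zero.
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def identity_score(a, b):
    """Toy substitution score (hypothetical stand-in for BLOSUM62)."""
    return 5.0 if a == b else -4.0


print(smith_waterman_affine("AAAACCCC", "AAAAGGCCCC", identity_score))
```

With the toy scores, aligning the two sequences across the two extra G residues costs one gap opening plus one extension (10.0 + 0.5), which is cheaper than accepting mismatches or stopping the alignment early.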
ALIGNMENTS 



RESULT 1 
S44605 

C02F5.3 protein - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 02-Feb-2001 
C; Accession: S44605 
R;Anderson, K. 

submitted to the EMBL Data Library, May 1993 

A; Description: Sequence of the C. elegans cosmid C02F5. 

A; Reference number: S44603 

A; Accession: S44605 

A; Status: preliminary 

A;Molecule type: DNA 

A; Residues: 1-573 <AND> 

A;Cross-references: EMBL:L14745; NID:g289607; PID:g289610 
C; Genetics : 

A;Introns: 224/2; 304/1; 363/3; 390/3; 503/2 

C;Superfamily: translation elongation factor Tu homology 

C; Keywords: GTP binding; nucleotide binding; P-loop 

F; 63-183/Domain: translation elongation factor Tu homology <ETU> 



F; 69-76/Region: nucleotide-binding motif A (P-loop) 
F;246-249/Region: GTP-binding NKXD motif 



Query Match 11.6%; Score 167.5; DB 2; Length 573; 

Best Local Similarity 27.9%; Pred. No. 1.3e-06; 

Matches 50; Conservative 23; Mismatches 57; Indels 49; Gaps 5; 

Qy 90 VTTGPWGAVATSAGGEESLKCEDLKVGQYICKDP K 124 

|:| I I I ::| I: :|:|| I 

Db 415 VSTNPLGPV VECRFLENS FI LCEDPVPLYGPGQTGQQPANES FRNEGKCLK 465 

Qy 125 INDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYK 184 

: I I I I II III II M I M : : 

Db 466 MGGYRAEDVEFTN VKCRVLPCIEC HGPRT FTKSTPCI I YNGHYFL 510 

Qy 185 VAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSY 243 

: : III III I: : I |:| II ::l |: : ::||:| II: 

Db 511 TTLLYSIFLGWAVDRFCLGYSAMAVGKIJytTLGGFGIWWIVDIFLLVLGVLGPAJDDSSW 569 



RESULT 2 
T28787 

hypothetical protein C41D11.5 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 29-Oct-1999 
C;Accession: T28787 
R;Gattung, S.; Maggi, L. 

submitted to the EMBL Data Library, May 1997 

A; Description: The sequence of C. elegans cosmid C41D11. 

A; Reference number: Z20522 

A;Accession: T28787 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A; Residues: 1-753 <GAT> 

A;Cross-references: EMBL:AF003740; PIDN:AAC48141.1; GSPDB:GN00019; CESP:C41D11.5 

A;Experimental source: strain Bristol N2; clone C41D11 

C;Genetics: 

A;Gene: CESP:C41D11.5 

A; Map position: 1 

A;Introns: 53/2; 81/3; 117/1; 250/3; 274/2; 357/3; 443/2; 485/3; 544/3; 585/3; 
637/2 

Query Match 11.1%; Score 159.5; DB 2; Length 753; 

Best Local Similarity 28.2%; Pred. No. 8.6e-06; 

Matches 46; Conservative 27; Mismatches 61; Indels 29; Gaps 5; 

Qy 104 GEESLKCE DLKVGQYI CKDPKINDATQE PVNCTNYTA HVSCF 145 

| || I : : I : I : I : : : II: I I I 
Db 284 GSAGLTCTFPGDCRIGDTV KVNCT S RKGCPN PVS RNNVEAVCRFCWQLLPGD YDCE 339 

Qy 146 PAPNITCKDS S GN ET H FT GN E VG F FK P I S C RN VN G Y S Y KVAVAL S L FL GWL G A 198 

|||: : I : : : I : I I I : I I I : : I I : II I I 

Db 340 PATNCSTSSTKLLVTKCSAHSSVICMGQRNFYKRIPCNWSSGYSWTKTMILSWLGGFGA 399 

Qy 199 DRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGS 241 

| | I I I I : I : I I : : I : I : I I : : : I I I I 

Db 400 DRFYLGLWKSAIGKLFSFGGLGVWTLVDWLIAVGYIKPYDGS 442 



RESULT 3 
H75286 

hypothetical protein - Deinococcus radiodurans (strain R1) 
C; Species: Deinococcus radiodurans 

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 31-Mar-2000 
C;Accession: H75286 

R;White, O.; Eisen, J.A.; Heidelberg, J.F.; Hickey, E.K.; Peterson, J.D.; 
Dodson, R.J.; Haft, D.H.; Gwinn, M.L.; Nelson, W.C.; Richardson, D.L.; Moffat, 
K.S.; Qin, H.; Jiang, L.; Pamphile, W.; Crosby, M.; Shen, M.; Vamathevan, J.J.; 
Lam, P.; McDonald, L.; Utterback, T.; Zalewski, C.; Makarova, K.S.; Aravind, L.; 
Daly, M.J.; Minton, K.W.; Fleischmann, R.D.; Ketchum, K.A.; Nelson, K.E.; 
Salzberg, S.; Smith, H.O.; Venter, J.C.; Fraser, C.M. 
Science 286, 1571-1577, 1999 

A;Title: Genome sequence of the radioresistant bacterium Deinococcus radiodurans 
R1. 

A;Reference number: A75250; MUID:20036896; PMID:10567266 
A;Accession: H75286 
A;Status: preliminary 
A;Molecule type: DNA 
A; Residues: 1-309 <WHI> 

A;Cross-references: GB:AE002064; GB:AE000513; NID:g6460134; PIDN:AAF11880.1; 

PID:g6460145; TIGR:DR2326; GSPDB:GN00077 

A;Experimental source: strain R1 

C; Genetics : 

A; Gene: DR2326 

A;Map position: 1 

Query Match 6.6%; Score 95; DB 2; Length 309; 

Best Local Similarity 21.7%; Pred. No. 1.1; 

Matches 73; Conservative 27; Mismatches 95; Indels 142; Gaps 16; 



Qy 16 GQKNTRRDGTGLYPM RGPFKNLALLPFS-LPLLGGGGSGSGEK VSVSKMAAAW 67 

| : : | | | : ||| | | | || | | : | | | 

Db 11 GRLARQRKGLDFRPVAEGERGPV FSPTPPFGGRNSGPVRRVLSVMTDKDRDAG 63 

Qy 68 PSG PSAPEAVTAR LVGVLWF 87 

||| | | | | | || 

Db 64 PSGNAPSWVDEVLSSSSSAPRPVEGRHGQTADPAQNPAGTAPGSGWDHWPQTDAARDLRL 123 

Qy 88 VSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEPVNCTNY 138 

| : | | || ||| : | | : | 

Db 124 PGDPPRPAPPSFDSDDWAARAT--GGE VRDPQGRD 156 

Qy 139 TAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYK 184 

| : | |: :| |: | | : | | : 

Db 157 PQESRTTVYSAAPQTDAWGDPVRPAPPAPVKPVRGQMGQSNGPAGLPVREDIA 209 

Qy 185 VAVALSLFLGWLGADRFYLGYPALGLLKF-CTVGF CGIGSLI 225 

: | | : | | | || : | | | | ||| : | | : | | 

Db 210 QKKLIAGLLGIFLGSLGVHKFYLGQNGAGLLMLGWIGVWVLAIVLSLLTLGLGAIILFP 269 

Qy 226 --DFILISMQIVGPSDGSSYII DY-YGTR 251 

|: : ::| :| |: ||||: 

Db 270 LAGFVTSVLGVIGLIEGILYLTKSDADFQRDYLYGNK 306 



RESULT 4 
S55661 

hypothetical protein 66 - equine herpesvirus 2 
C; Species: equine herpesvirus 2 

C;Date: 27-Oct-1995 #sequence_revision 03-Nov-1995 #text_change 08-Oct-1999 
C;Accession: S55661 

R;Telford, E.A.R.; Watson, M.S.; Aird, H.C.; Perry, J.; Davison, A.J. 
J. Mol. Biol. 249, 520-528, 1995 

A; Title: The DNA sequence of equine herpesvirus 2. 

A;Reference number: S55594; MUID:95302501; PMID:7783207 

A;Accession: S55661 

A; Status: preliminary; nucleic acid sequence not shown; translation not shown 
A;Molecule type: DNA 
A; Residues: 1-456 <TEL> 

A;Cross-references: GB:U20824; NID:g695172; PIDN:AAC13854.1; PID:g695239 

A; Note: the nucleotide sequence was submitted to the EMBL Data Library, February 

1995 

Query Match 6.4%; Score 92; DB 2; Length 456; 

Best Local Similarity 26.2%; Pred. No. 3.2; 

Matches 42; Conservative 11; Mismatches 45; Indels 62; Gaps 7; 

Qy 47 LGGGGSGSGEKVSVSKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEE 106 

I I I I I I I : I : I III 

Db 13 LGGGGGGGGD LLG GGEA 29 

Qy 107 S LKCEDLKVGQYIC-KDPKINDATQEPVNCTNYTAHV SCFPAPNITCKDS 155 

II |:||: I : I : I : II: Ml :ll I 

Db 30 DGLMRALCEGLRVGEEDCARFVLYGVAYWQGGRCPEWVAHITRCADLSCFATYLLTCHRS 89 

Qy 156 SGNETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGW 195 

I I I I I I : I I MINI: 
Db 90 GGCE — FTGGRVARDRLP S LRE SVEVLQSLFLAF 121 



RESULT 5 
VGIH59 

E2 glycoprotein precursor - murine hepatitis virus (strain A59) 
N;Alternate names: peplomer glycoprotein; spike glycoprotein 
C; Species: murine hepatitis virus, MHV 

C;Date: 31-Mar-1989 #sequence_revision 31-Mar-1989 #text_change 12-Apr-1996 
C;Accession: A27402 

R;Luytjes, W. ; Sturman, L.S.; Bredenbeek, P.J.; Charite, J.; van der Zeijst, 
B.A.M.; Horzinek, M.C.; Spaan, W.J.M. 
Virology 161, 479-487, 1987 

A; Title: Primary structure of the glycoprotein E2 of coronavirus MHV-A59 and 

identification of the trypsin cleavage site. 

A;Reference number: A27402; MUID:88072088; PMID:2825419 

A; Accession: A27402 

A;Molecule type: genomic RNA 

A; Residues: 1-1324 <LUY> 

C;Superfamily: coronavirus E2 glycoprotein 

C; Keywords: glycoprotein; transmembrane protein 

F;1-16/Domain: signal sequence #status predicted <SIG> 

F;17-1324/Product: E2 glycoprotein #status predicted <E2G> 

F;17-717/Product: 90B glycoprotein #status predicted <EGB> 

F;718-1324/Product: 90A glycoprotein #status predicted <EGA> 
F;1266-1286/Domain: transmembrane #status predicted <TMN> 

F;31,60,192,247,357,435,442,530,625,657,665,688,737,754,844,893,1126,1180,1190, 
1209,1225,1246,1318/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 6.3%; Score 91; DB 1; Length 1324; 

Best Local Similarity 23.6%; Pred. No. 13; 

Matches 59; Conservative 27; Mismatches 94; Indels 70; Gaps 13; 

Qy 25 TGLYPMRG- PFKNLALLP FSLPLLGGGGSGSGEKVSVSKMAAAWPSGPSA — 73 

1111:11:1111 III I I I : : I I I : I 

Db 66 TGYYPVDGSKFRNLALTGTNSVSLSWFQPPYLNQFNDGIFAK — VQNLKTSTPSGATAYF 123 

Qy 74 PEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQY-ICKDPKINDATQEP 132 

II II : : I I : I : : I I I I I : I 
Db 124 PTIVIGSLFGYTSY-TWIEPYNGVIMAS VCQYTICQLP 161 

Qy 133 VNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNE-VGFF KPISCRNVNGYSYKVAV 187 

: I I I I I : : I I : III : : I 

Db 162 YTDCKPNTN GNKLIGFWHT D VK P P I C VL KRNFT LNVNA 199 

Qy 188 ALSLFLGWLGADRFYLGY PALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244 

I : Ml I : I II:: : I I : I I : : 

Db 200 DAFYFHFYQHGGTFYAYYADKPSATTFLFSVY IGDILTQYYVLPFICNPTAGSTFA 255 

Qy 245 IDYYGTRLTR 254 

I : I I : 
Db 256 PRYWVTPLVK 265 



RESULT 6 
T08604 

hypothetical protein GRR1 - soybean 
C; Species: Glycine max (soybean) 

C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 11-Jun-1999 
C;Accession: T08604 
R;Chen, W. ; Atherly, A. 

submitted to the EMBL Data Library, August 1997 
A; Reference number: Z15438 
A;Accession: T08604 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A; Residues: 1-690 <CHE> 

A;Cross-references: EMBL:AF019910; NID:g2407789; PID:g2407790 
A; Experimental source: variety L85-3044; root 
C; Genetics : 
A;Gene: grr1 

Query Match 6.2%; Score 89.5; DB 2; Length 690; 

Best Local Similarity 24.6%; Pred. No. 8.4; 

Matches 43; Conservative 22; Mismatches 59; Indels 51; Gaps 7; 

Qy 85 LWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEPV--NCTNYTAHV 142 

|||:|| I : : : I : I : Ml I hi I I I I I I : 

Db 233 LWDVA-TVGDVGLI EI ASGCHQLEKLD LCKCPNISDKTLIAVAKNCPN-LAEL 283 



Qy 143 SCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFY 202 



I III III : : I I : | : I : 
Db 284 SIESCPNI GNEGLQAI GKCPNLRS I SI KNCSGVGDQ 319 

Qy 203 LGYPALGLLKFCTVGFCGIGSLIDFIL--ISMQIVGPSDGSSYIIDYYGTRLTRL 255 

I I : I II : :: : III : I : I I : I I 
Db 320 GVAGLL S S AS FALT KVKLE S LTVS DLS LAVI GH YGVAVT DL 360 



RESULT 7 
H75632 

Na(+)-linked D-alanine glycine permease - Deinococcus radiodurans (strain R1) 
C; Species: Deinococcus radiodurans 

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 17-Mar-2000 
C;Accession: H75632 

R;White, O.; Eisen, J.A.; Heidelberg, J.F.; Hickey, E.K.; Peterson, J.D.; 
Dodson, R.J.; Haft, D.H.; Gwinn, M.L.; Nelson, W.C.; Richardson, D.L.; Moffat, 
K.S.; Qin, H.; Jiang, L.; Pamphile, W.; Crosby, M.; Shen, M.; Vamathevan, J.J.; 
Lam, P.; McDonald, L.; Utterback, T.; Zalewski, C.; Makarova, K.S.; Aravind, L.; 
Daly, M.J.; Minton, K.W.; Fleischmann, R.D.; Ketchum, K.A.; Nelson, K.E.; 
Salzberg, S.; Smith, H.O.; Venter, J.C.; Fraser, C.M. 
Science 286, 1571-1577, 1999 

A;Title: Genome sequence of the radioresistant bacterium Deinococcus radiodurans 
R1. 

A;Reference number: A75250; MUID:20036896; PMID:10567266 
A;Accession: H75632 
A;Status: preliminary 
A;Molecule type: DNA 
A; Residues: 1-547 <WHI> 

A;Cross-references: GB:AE001826; NID:g6460827; PIDN:AAF12563.1; PID:g6460859; 

TIGR:DRB0133; GSPDB:GN00079 

A;Experimental source: strain R1 

C; Genetics : 

A; Gene: DRB0133 

A; Map position: megaplasmid 

A; Genome: plasmid 

A;Note: plasmid MP1 

C;Superfamily: sodium-dependent D-alanine/glycine transport protein 

Query Match 6.1%; Score 87.5; DB 2; Length 547; 

Best Local Similarity 25.6%; Pred. No. 9.6; 

Matches 57; Conservative 19; Mismatches 52; Indels 95; Gaps 13; 

Qy 43 SLPLLGGGGSGSGEKVSVSKMAAA WPS — G P SAP EAVT ARL VGV 84 

Ml | | | :: : :: | | I I I I I I : I : I 

Db 29 SRPLSSESGSSSAQEPWMGRLPAALVFTGLLGAVSWASAQGPSVDERINAWTPVSHFLS 88 

Qy 85 -LWFVSVTTGP WGAVATSAGGEESLKCE DLKVGQYIC 120 

I I I : : I I I I I : : I I I I : I 

Db 89 GLI FAS I SVGEAQVPLI WWLAVA AI VCTLS FRFVNI WGFKHGI DLVRGRY — 139 

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180 

Ml I : I I I I : I : I 

Db 140 GNDA DAP GMVT H FQ ALT T AVS GT VGLGN I AG 170 

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLL KF — CTVG 217 

I I I I I I II II : : : I I I I I I I : I 

Db 171 VAVALS — LGGPGATFWMI LVGLLSMSTKFVECTLG 204 



RESULT 8 
T23754 

hypothetical protein T05C12.10 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 29-Oct-1999 
C;Accession: T23754; T24513 
R; Thomas, K. 

submitted to the EMBL Data Library, June 1995 
A;Reference number: Z19793 
A; Accession: T23754 

A; Status: preliminary; translated from GB/EMBL/DDBJ 

A;Molecule type: DNA 

A; Residues: 1-1207 <WIL> 

A;Cross-references: EMBL:Z49968; PIDN:CAA90265.1; GSPDB:GN00020; CESP:T05C12.10 
A; Experimental source: clone M110 
R; Burton, J. 

submitted to the EMBL Data Library, October 1995 
A;Reference number: Z19901 
A; Accession: T24513 

A; Status: preliminary; translated from GB/EMBL/DDBJ 

A;Molecule type: DNA 

A; Residues: 1-1207 <WI2> 

A;Cross-references: EMBL:Z66500; PIDN:CAA91313.1; GSPDB:GN00020; CESP:T05C12.10 
A;Experimental source: clone T05C12 
C;Genetics: 

A;Gene: CESP:T05C12.10 
A;Map position: 2 

A;Introns: 31/3; 87/2; 141/3; 180/2; 203/3; 267/1; 776/2; 794/2; 834/2; 1086/3; 
1143/1; 1181/1 

Query Match 6.1%; Score 87.5; DB 2; Length 1207; 

Best Local Similarity 21.1%; Pred. No. 23; 

Matches 47; Conservative 26; Mismatches 83; Indels 67; Gaps 8; 

Qy 2 HILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSVS 61 

I : I : I I : I I I : I I I I I I I I I I : 

Db 848 HNAESSASGI PLVQARSNTVNGGAPVPPAPGS GATGSGTSGSGTSESVT 896 

Qy 62 KMAAAWPSGPS APEAVTARLVGVLWFVSVTTGPWGAVATSAG 103 

: I I I : I I I : : I : I I I : I 

Db 897 NGSGATESGSTGSGTTGTGTSGTGSSGTGASAARTSSIAGDAPQAAVLADTPGAAGAAGG 956 

Qy 104 GE ESL KCEDLKVGQYICKDPKINDAT QEPVNCTNYTA 140 

I : I I : : : I : : I I : I : I I i I : 

Db 957 GRSNCFSADSLVTTVTGQKRMDELQIGDYVLVPSSGNVLKYEKVEMFYHREPKTRTNF-- 1014 

Qy 141 HVSCFPAPNITCKDSSGNETHFTGNEVGFFKPIS-CRNVNGYS 182 

: ||: II: | : : | | I : 

Db 1015 WLYTKSGRKLSLTGRHL LPVAECSQVEQYT 1045 



RESULT 9 
S20911 

alcohol dehydrogenase (EC 1.1.1.1) II - yeast (Kluyveromyces marxianus var. 
lactis) 



C; Species: Kluyveromyces marxianus var. lactis, Candida sphaerica 

C;Date: 20-Apr-2000 #sequence_revision 20-Apr-2000 #text_change 20-Apr-2000 

C;Accession: S20911; S19804 

R;Shain, D.H. ; Salvadore, C. ; Denis, C.L. 

Mol. Gen. Genet. 232, 479-488, 1992 

A; Title: Evolution of the alcohol dehydrogenase (ADH) genes in yeast: 

characterization of a fourth ADH in Kluyveromyces lactis. 

A; Reference number: S20911; MUID: 92269769; PMID: 1588917 

A;Accession: S20911 

A;Molecule type: DNA 

A; Residues: 1-348 <SHA> 

A;Cross-references: EMBL:X64397; NID:g2832; PIDN:CAA45739.1; PID:g2833 
C; Genetics : 
A; Gene: ADH2 

C;Superfamily: alcohol dehydrogenase; long-chain alcohol dehydrogenase homology 

C; Keywords: alcohol metabolism; metalloprotein; NAD; oxidoreductase; zinc 

F;29-336/Domain: long-chain alcohol dehydrogenase homology <LADH> 

F; 173-202/Region: beta-alpha-beta NAD nucleotide-binding fold 

F;44, 67, 154/Binding site: zinc, catalytic (Cys, His, Cys) #status predicted 

F; 98, 101, 104, 112/Binding site: zinc, noncatalytic (Cys) #status predicted 

Query Match 5.9%; Score 85.5; DB 1; Length 348; 

Best Local Similarity 19.9%; Pred. No. 8.5; 

Matches 65; Conservative 35; Mismatches 111; Indels 115; Gaps 13; 

Qy 19 RRDG TGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSVSKMAAAW 67 

: | ||: : | : || | | | : | | || | : : : | 

Db 37 KYSGVCHTDLHAWKGDWPLPTKLPLV-GGHEGAGVWAMGENVKGW11GDFAGI 91 

Qy 68 -PSGPSAPEAVT 78 

| | | 

Db 92 

Qy 79 ARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKV 115 

| | | :: : | | || : | 

Db 152 

Qy 116 GQYICKDPKINDATQEPVNCTNYTAH VSCFPAPNITCKDSSGNETHFTGN 165 

| : | | : | | : | | | | ||| : | | | 

Db 212 SLGGEYFVDYAVSKDLIKEIVDATNGGAHGVINVSVSEFAI EQSTNYVRSNGT 265 

Qy 166 

| : : : | ||| | : | | | 

Db 266 

Qy 223 

|| : : : | | | | : : | 

Db 325 1LADVYDKMVKGEIVG RYWD 345 



RESULT 10 
T35005 

probable integral membrane transporter - Streptomyces coelicolor 
C; Species: Streptomyces coelicolor 

C;Date: 05-Nov-1999 #sequence_revision 05-Nov-1999 #text_change 17-Mar-2000 
C;Accession: T35005 



R;Seeger, K.J.; Harris, D.; Bentley, S.D.; Parkhill, J.; Barrell, B.G.; 
Rajandream, M.A. 

submitted to the EMBL Data Library, December 1998 
A;Reference number: Z21564 
A; Accession: T35005 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A; Residues: 1-446 <SEE> 

A;Cross-references: EMBL:AL034443; PIDN:CAA22367.1; GSPDB:GN00070; 
SCOEDB:SC4B5.13 

A;Experimental source: strain A3(2) 
C;Genetics: 

A;Gene: SCOEDB:SC4B5.13 

C;Superfamily: hypothetical protein c0103 

Query Match 5.8%; Score 84; DB 2; Length 446; 

Best Local Similarity 19.7%; Pred. No. 15; 

Matches 52; Conservative 34; Mismatches 110; Indels 68; Gaps 9; 

Qy 39 LLPFSLPLLGGGGSGSGEKVSVSKMAAAWPSGPSAPEAVTARLVGVLWF--VSVTTGPWG 96 

|: :|| | | | || : :| | | : :: | :: || 

Db 106 LIVLGYGFVGGIGLGIGYISPVSTLIKWFPDRPG-----MATGIAIMGFGGGALIASPWS 160 

Qy 97 AVATSAGGEES LKCEDLKVGQYICKDPK IN 126 

| : | :: : :| : : |: 

Db 161 AQMLKSFGTDNSGIALAFLVHGLTYAVFMLLGVLLVRVPRPRERADGRPAPLEGVQVSAR 220 

Qy 127 DATQEP VNCTN YTAHVSCF--PAPNITCKDSSGNETHFTGNEVGFFKPISCRN 177 

| : | ||||: | | | | | : | | || 

Db 221 

Qy 178 VNGYSYKVAVALSLFLGW LGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILIS 231 

| | : | | | : | | : : || | : | : 

Db 281 -FGWSSASDLIGRKNIYRVYLGVGALMYTLIALFGDSSKPLFVLCA 329 

Qy 232 MQIV GPSDGSSYIIDYYGT 250 

: | | : : | : | : | | 

Db 330 

RESULT 11 
C70574 

probable aroP2 protein - Mycobacterium tuberculosis (strain H37RV) 
C; Species: Mycobacterium tuberculosis 

C;Date: 17-Jul-1998 #sequence_revision 17-Jul-1998 #text_change 20-Jun-2000 
C;Accession: C70574 

R;Cole, S.T.; Brosch, R.; Parkhill, J.; Garnier, T.; Churcher, C.; Harris, D.;
Gordon, S.V.; Eiglmeier, K.; Gas, S.; Barry III, C.E.; Tekaia, F.; Badcock, K.;
Basham, D.; Brown, D.; Chillingworth, T.; Connor, R.; Davies, R.; Devlin, K.;
Feltwell, T.; Gentles, S.; Hamlin, N.; Holroyd, S.; Hornsby, T.; Jagels, K.;
Krogh, A.; McLean, J.; Moule, S.; Murphy, L.; Oliver, S.; Osborne, J.; Quail,
M.A.; Rajandream, M.A.; Rogers, J.; Rutter, S.; Seeger, K.; Skelton, S.;
Squares, S.

Nature 393, 537-544, 1998 

A;Authors: Sqares, R. ; Sulston, J.E.; Taylor, K. ; Whitehead, S.; Barrell, B.G. 
A; Title: Deciphering the biology of Mycobacterium tuberculosis from the complete 
genome sequence. 



A;Reference number: A70500; MUID:98295987; PMID:9634230
A;Accession: C70574 

A; Status: preliminary; nucleic acid sequence not shown; translation not shown 
A;Molecule type: DNA 
A; Residues: 1-487 <COL> 

A;Cross-references: GB:Z95324; GB:AL123456; NID:g3261760; PIDN:CAB08578.1;
PID:g2094825

A; Experimental source: strain H37Rv 
C; Genetics : 
A; Gene: aroP2 

C;Superfamily: arginine permease

Query Match 5.8%; Score 83.5; DB 2; Length 487; 

Best Local Similarity 26.5%; Pred. No. 19; 

Matches 27; Conservative 16; Mismatches 40; Indels 19; Gaps 5; 

Qy 159 ETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGF 218 

: | : | : | : I : : I : I I I I I II I I 

Db 8 DERLTREDTGYHKGLHSRQLQMIALGGAIGTGLFLG — AGGRLASAGPGL FLVYGI 61 

Qy 219 CGIGSLIDFILISMQIVG PSDGS — SYIIDYYGTRL 252 

Ml | : : : : : I I I I I II : : I I : : 

Db 62 CGI FVFLILRALGELVLHRPSSGSFVSYAREFYGEKV 98 



RESULT 12 
B75447 

hypothetical protein - Deinococcus radiodurans (strain R1)
C; Species: Deinococcus radiodurans 

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 31-Mar-2000 
C; Accession: B75447 

R;White, O.; Eisen, J.A.; Heidelberg, J.F.; Hickey, E.K.; Peterson, J.D.;
Dodson, R.J.; Haft, D.H.; Gwinn, M.L.; Nelson, W.C.; Richardson, D.L.; Moffat,
K.S.; Qin, H.; Jiang, L.; Pamphile, W.; Crosby, M.; Shen, M.; Vamathevan, J.J.;
Lam, P.; McDonald, L.; Utterback, T.; Zalewski, C.; Makarova, K.S.; Aravind, L.;
Daly, M.J.; Minton, K.W.; Fleischmann, R.D.; Ketchum, K.A.; Nelson, K.E.;
Salzberg, S.; Smith, H.O.; Venter, J.C.; Fraser, C.M.
Science 286, 1571-1577, 1999
A;Title: Genome sequence of the radioresistant bacterium Deinococcus radiodurans
R1.
A;Reference number: A75250; MUID:20036896; PMID:10567266
A; Accession: B75447 
A; Status: preliminary 
A;Molecule type: DNA 
A; Residues: 1-137 <WHI> 

A;Cross-references: GB:AE001954; GB:AE000513; NID:g6458751; PIDN:AAF10608.1;
PID:g6458763; TIGR:DR1033; GSPDB:GN00077
A;Experimental source: strain R1

C; Genetics : 

A;Gene: DR1033 

A;Map position: 1 

Query Match 5.8%; Score 83; DB 2; Length 137; 

Best Local Similarity 26.5%; Pred. No. 4.9; 

Matches 39; Conservative 16; Mismatches 48; Indels 44; Gaps 8; 



Qy 



33 PFKNL ALLPFSLPLLGGGGSGSGEKVSVSKMAAAWPSGPSAP-EAVTARLVGVLWFV 88 



Ill II I I I I I I : I : I : I [ I : I I II 

Db 17 PIKKLLPWLLASVLTACGGGTSTPG TSTPNTPAVPSSAVAPKLSG FV 64 

Qy 89 SVT TGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEPVNCTNYT 139 

I : I | | | | | : : I : I I 
Db 65 LSGSQHSLTVSLNAPASCVFNSAAGSLNMTAATLEGSPYA— YA 106 

Qy 140 AHVS-CFPAPNITCKDSSGNETHFTGN 165 

: I : I : : | | : I : I : : I II 
Db 107 VSLSGSYPKASVTCTNSAGSDTLSLGN 133 



RESULT 13 
S32521 

alcohol dehydrogenase (EC 1.1.1.1) 1 - yeast (Kluyveromyces marxianus var.
marxianus)

C; Species: Kluyveromyces marxianus var. marxianus, Candida kefyr 

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 28-Jul-2000 

C;Accession: S32521 

R;Ladriere, J.M.; Delcour, J.; Vandenhaute, J. 
Biochim. Biophys. Acta 1173, 99-101, 1993
A;Title: Sequence of a gene coding for a cytoplasmic alcohol dehydrogenase from
Kluyveromyces marxianus ATCC 12424.
A;Reference number: S32521; MUID:93250057; PMID:8485163
A;Accession: S32521
A;Molecule type: DNA
A;Residues: 1-348 <LAD>
A;Cross-references: EMBL:X60224; NID:g6822201; PIDN:CAA42785.1; PID:g297908
C;Genetics:
A;Gene: ADH1
C;Superfamily: alcohol dehydrogenase; long-chain alcohol dehydrogenase homology
C; Keywords: alcohol metabolism; metalloprotein; NAD; oxidoreductase; zinc 
F;29-336/Domain: long-chain alcohol dehydrogenase homology <LADH> 
F;44, 67, 154/Binding site: zinc, catalytic (Cys, His, Cys) #status predicted 
F;98, 101, 104, 112/Binding site: zinc, noncatalytic (Cys) #status predicted 

Query Match 5.8%; Score 83; DB 1; Length 348; 

Best Local Similarity 20.8%; Pred. No. 14; 

Matches 65; Conservative 38; Mismatches 107; Indels 102; Gaps 14; 

Qy 19 NTRRDG TGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSVSKMAAAWPSG 7 0 

I : | ||: : I : I I I I : II hi I : : : I I 

Db 37 NVKYSGVCHTDLHAWQGDWP LDTKLPLV-GGHEGAGIWAMGENVTGWEIGDYAGI 91 

Qy 71 PSAPEA VTARLV — 82 

I : hi III 
Db 92 KWLNGSCMSCEECELSNEPNCPKADLSGYTHDGSFQQYATADAVQAARIPKNVDLAEVAP 151 

Qy 83 GV LWFVSVTTGPWGAVATSAGGEESLKCEDLKV 115 

II I : I I I : : : I I II : I 

Db 152 ILCAGVTWKALKSAHIKAGDWVAISGACGGLGSLAIQYAKAMGYRVLGIDAGDEKAKLF 211 

Qy 116 GQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNE — VGF 169 

hi II I : I I I I : : : I I I I I 

Db 212 KELGGEYFIDFTKTKDMVAEVIEATNGVAHAVINVSVSEAAISTSVLYTRSNGTWLVGL 271 



QY 



170 FKPISCRNVNGYSYKVAVALSLFLGWLG— AD-RFYLGYPALGLLK— FCTVGFCGIGSL 224 



: |: : :| ::|: : : I I I I I : : I I : I :l : I: 

Db 272 PRDAQCK — SDVFNQWKSISIVGSYVGNRADTREALDFFSRGLVKAPIKILGLSELASV 329 

Qy 225 IDFILISMQIVG 236 

I : : I I II 
Db 330 YD-KMVKGQIVG 340 



RESULT 14 
BBHU 

complement factor B precursor [validated] - human 

N;Alternate names: C3 convertase; C3 proactivator; glycine-rich beta-
glycoprotein; heat-labile complement factor; proenzyme factor B; properdin
factor B 

N;Contains: alternative-complement-pathway C3/C5 convertase (EC 3.4.21.47) Bb 
fragment 

C; Species: Homo sapiens (man) 

C;Date: 19-Feb-1984 #sequence_revision 05-Aug-1994 #text_change 08-Dec-2000 
C;Accession: S34075; A44622; A00934; A19188; A19947; B19947; B25971; S14339;
A44628; I54409; I57824; B19447

R;Mejia, J.E.; Jahn, I.; de la Salle, H.; Hauptmann, G. 

submitted to the EMBL Data Library, March 1993 

A; Reference number: S34075 

A; Accession: S34075 

A; Molecule type: mRNA 

A; Residues: 1-764 <MEJ> 

A;Cross-references: EMBL:X72875; NID:g297568; PIDN:CAA51389.1; PID:g297569
R;Woods, D.E.; Markham, A.F.; Ricker, A.T.; Goldberger, G. ; Colten, H.R. 
Proc. Natl. Acad. Sci. U.S.A. 79, 5661-5665, 1982 

A; Title: Isolation of cDNA clones for the human complement protein factor B, a 
class III major histocompatibility complex gene product. 
A;Reference number: A44622; MUID:83039428; PMID:6957884
A;Accession: A44622
A;Molecule type: mRNA
A;Residues: 467-546;550-595;752-764 <WOO>
A;Cross-references: GB:J00185; GB:J00186
A;Note: the authors translated the codon TAC at 519 as Thr; the nucleic acid
translation differs from the sequence shown in having 537-Thr, and 764-His
R;Mole, J.E.; Anderson, J.K.; Davison, E.A.; Woods, D.E.
J. Biol. Chem. 259, 3407-3412, 1984 

A; Title: Complete primary structure for the zymogen of human complement factor 
B. 

A;Reference number: A20751; MUID:84161997; PMID:6546754
A;Accession: A00934
A;Molecule type: protein; mRNA
A;Residues: 26-764 <MOL>
A;Cross-references: GB:K01566
A;Note: nucleic acid translation differs from the sequence shown in having 300-
Leu, 328-Val, 356-Glu, and 357-Glu

A;Note: 736-Ser was also found 

A;Note: glycosylation sites were determined 

R;Christie, D.L.; Gagnon, J. 

Biochem. J. 209, 61-70, 1983 

A; Title: Amino acid sequence of the Bb fragment from complement factor B. 
Sequence of the major cyanogen bromide-cleavage peptide (CB-II) and completion 
of the sequence of the Bb fragment. 

A;Reference number: A19188; MUID:83204002; PMID:6342610
A;Contents: the final paper in a series documenting the sequence, glycosylation
site, and active site
A;Accession: A19188
A;Molecule type: protein
A;Residues: 260-296,'T',298-764 <CHR>

R; Campbell, R.D.; Porter, R.R. 

Proc. Natl. Acad. Sci. U.S.A. 80, 4464-4468, 1983 

A; Title: Molecular cloning and characterization of the gene coding for human 
complement protein factor B. 

A;Reference number: A19947; MUID:83273641; PMID:6308626
A;Accession: A19947
A;Molecule type: DNA
A;Residues: 346-764 <CAM>
A;Cross-references: GB:J00125
A;Accession: B19947
A;Molecule type: mRNA
A;Residues: 339-509 <CA1>
A;Cross-references: GB:J00126; NID:g187723; PIDN:AAA36226.1; PID:g553536
R;Wu, L.; Morley, B.J.; Campbell, R.D.
Cell 48, 331-342, 1987
A;Title: Cell-specific expression of the human complement protein factor B gene:
evidence for the role of two distinct 5'-flanking elements.
A;Reference number: A25971; MUID:87102880; PMID:3643061
A;Accession: B25971
A;Molecule type: DNA
A;Residues: 1-99 <WUL>
A;Cross-references: GB:M15082; NID:g187699; PIDN:AAA59625.1; PID:g553534
R;Niemann, M.A.; Bhown, A.S.; Miller, E.J.
Biochem. J. 274, 473-480, 1991
A;Title: The principal site of glycation of human complement Factor B.
A;Reference number: S14339; MUID:91174758; PMID:2006911
A;Accession: S14339
A;Molecule type: protein
A;Residues: 270-329 <NIE>
A;Note: binding site for carbohydrate to lysine under artificial conditions
R;Morley, B.J.; Campbell, R.D. 
EMBO J. 3, 153-157, 1984 

A; Title: Internal homologies of the Ba fragment from human complement component 
factor B, a class III MHC antigen. 

A;Reference number: A44628; MUID:84158524; PMID:6323161
A;Accession: A44628
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 16-225,'F',227-259 <MOR>
R;Schwaeble, W.; Luttig, B.; Sokolowski, T.; Estaller, C.; Weiss, E.H.; Meyer
zum Buschenfelde, K.H.; Whaley, K.; Dippold, W.
Immunobiology 188, 221-232, 1993 

A; Title: Human complement factor B: functional properties of a recombinant 
zymogen of the alternative activation pathway convertase. 
A;Reference number: I54409; MUID:94041399; PMID:8225386
A;Accession: I54409
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-764 <RES>
A;Cross-references: GB:S67310; NID:g452937; PIDN:AAD13989.1; PID:g4261689
R;Horiuchi, T.; Kim, S.; Matsumoto, M.; Watanabe, I.; Fujita, S.; Volanakis,
J.E.
Mol. Immunol. 30, 1587-1592, 1993
A;Title: Human complement factor B: cDNA cloning, nucleotide sequencing,
phenotypic conversion by site-directed mutagenesis and expression.
A;Reference number: I57824; MUID:94067177; PMID:8247029
A;Accession: I57824
A;Status: translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-31,'Q',33-764 <RE2>
A;Cross-references: GB:L15702; NID:g291921; PIDN:AAA16820.1; PID:g291922

C;Comment: 292-Cys has a free sulfhydryl. 

C;Genetics:
A;Gene: GDB:BF
A;Cross-references: GDB:119726; OMIM:138470
A;Map position: 6p21.3-6p21.3
A;Introns: 21/3; 99/3; 346/1; 390/1; 424/1; 470/1; 502/3; 542/1; 593/2; 619/1;
652/3; 697/1; 713/3

A;Note: the list of introns may be incomplete 

A; Note: gene is located in the major histocompatibility complex, class III 
region 

C; Complex: complement factor B initially forms an inactive complex with 
complement factor C3b, becoming susceptible to cleavage by factor D into Ba and 
Bb fragments; Bb remains associated with complement factor C3b forming active 
C3/C5 convertase; Ba is released 
C; Function: 

A; Description: Bb is a serine proteinase; C3/C5 convertase cleaves complement C3 
alpha chain to release C3a and form C3b; it also cleaves C5 alpha chain to 
release C5a and form C5b; Ba is nonfunctional 
A; Pathway: complement alternate pathway 

C;Superfamily: complement C2; complement factor H repeat homology; trypsin
homology; von Willebrand factor type A repeat homology
C;Keywords: acute phase; complement alternate pathway; duplication;
glycoprotein; hydrolase; plasma; serine proteinase
F;1-25/Domain: signal sequence #status predicted <SIG>
F;26-764/Product: complement factor B #status experimental <MAT>
F;26-259/Product: complement factor Ba fragment #status experimental <BAF>
F;37-98/Domain: complement factor H repeat homology <FH1>
F;103-158/Domain: complement factor H repeat homology <FH2>
F;165-218/Domain: complement factor H repeat homology <FH3>
F;260-764/Product: C3/C5 convertase Bb fragment #status experimental <BBF>
F;268-458/Domain: von Willebrand factor type A repeat homology <VFA>
F;482-752/Domain: trypsin homology #status atypical <TRY>
F;37-76,62-98,103-145,131-158,165-205,191-218,478-596,511-527,599-615,656-
682,695-725/Disulfide bonds: #status predicted
F;122,142,285,378/Binding site: carbohydrate (Asn) (covalent) #status
experimental
F;259-260/Cleavage site: Arg-Lys (complement factor D) #status experimental
F;526,576,699/Active site: His, Asp, Ser #status experimental

Query Match 5.8%; Score 83; DB 1; Length 764; 

Best Local Similarity 24.1%; Pred. No. 34; 

Matches 49; Conservative 21; Mismatches 71; Indels 62; Gaps 12; 

Qy 24 GTGLYPMRGPFKNLALLPFSLPLLGGG GSGSGEKVSV 60 

I : I I I I : I I I I I I I Mill: 

Db 2 GSNLSP QLCLMPFILGLLSGGVTTTPWSLARPQGSCSLEGVEIKGGSFRLLQEG 55 



Qy 



61 S KMAAAW P S G — P SAP EAVT ARLVGVLW FVS VTT G PWGAVAT S AGGEESLKC — 110 



: I I I I : I I :| I I : I - I 

Db 56 QALEYVCPSGFYPYPVQTRTCR STGSWSTLKTQDQKTVRKAECRAIHCPR 105 

Qy HI -EDLKVGQYICKDPKINDATQEPVNC-TNYTAHVSCFPAPNITCKDSS — GNETHFTGNE 166 

| : I : I : I I : : : I II I III:: : I I 

Db 106 PHDFENGEYWPRS P YYNVSDEI S FHCYDGYTLRGSA NRTCQVNGRWS GQTAI CDNG 161 

Qy 167 VGFFK PISCRNVNGYSYKV 185 

I : II III I : : 

Db 162 AGYCSNPGIPIGTRKV-GSQYRL 183 



RESULT 15 
T10729 

transferrin-like protein Ttf-1, salt-induced - green alga (Dunaliella salina) 
C; Species: Dunaliella salina 

C;Date: 16-Jul-1999 #sequence_revision 16-Jul-1999 #text_change 16-Jul-1999 
C;Accession: T10729 

R;Fisher, M.; Gokhman, I.; Pick, U.; Zamir, A.
submitted to the EMBL Data Library, November 1996
A;Reference number: Z17101
A;Accession: T10729
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1274 <FIS>
A;Cross-references: EMBL:U77059; NID:g1684791; PID:g1684792
C;Genetics:
A;Gene: ttf1
C;Superfamily: transferrin repeat homology

Query Match 5.8%; Score 83; DB 2; Length 1274; 

Best Local Similarity 22.9%; Pred. No. 61; 

Matches 30; Conservative 17; Mismatches 50; Indels 34; Gaps 6 

Qy 57 KVS VS KMAAAW P S GP SAP EAV- TARLVGVLW FVS VT T G PWGAVAT S AGGEE S LKC ED L KV 115 

: | : I II: I : I I : I I : I I 
Db 579 QVDAETIEKFWEDNVCAPGSTENGPLIG GGKYGEVGENGGG 619 

Qy 116 GQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISC 175 

:|| | : :::| I II I I I : ' I I I : : I I : 

Db 620 LCKRCKTDCTSEDPY — AGYDGAVHCI DDDDGNQ — FTGGDIAFVKHSTL 665 

Qy 176 RNVNGYSYKVA 186 

I : I I : I 
Db 666 RDYNGPNLNTA 676 



Search completed: March 4, 2004, 10:27:47 
Job time : 46 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: March 4, 2004, 10:26:56 ; 



Search time 652 Seconds 

(without alignments) 

87.117 Million cell updates/sec 



Title: US-09-852-100B-2
Perfect score: 1439
Sequence: 1 MHILKGSPNVIPRAHGQKNT...TRLTRLSITNETFRKTQLYP 269

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 809742 seqs, 211153259 residues



Total number of hits satisfying chosen parameters: 809742

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database 



Published_Applications_AA:*
1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
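
Pred. No. cannot be recomputed from the values printed here, since it depends on the full score distribution of the search. The two percentage fields printed with each result do, however, appear to follow from the other printed numbers. The Python sketch below is an inference from the figures in this listing, not a documented GenCore definition: it assumes Query Match is the hit score divided by the perfect score (1439 for this query), and Best Local Similarity is Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Assumed definition: hit score as a percentage of the query's perfect score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Assumed definition: identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Values taken from result lines in this listing (Score 201, perfect score 1439;
# Matches 53; Conservative 12; Mismatches 39; Indels 12).
print(query_match_pct(201, 1439))
print(best_local_similarity_pct(53, 12, 39, 12))
```

Under these assumptions the printed "Query Match 14.0%" and "Best Local Similarity 45.7%" for that hit are reproduced, which is a quick consistency check when a digit in a result line is in doubt.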
ALIGNMENTS



RESULT 1 

US-09-852-100A-2 

; Sequence 2, Application US/09852100A 

; Patent No. US20020058267A1 

; GENERAL INFORMATION: 

; APPLICANT: American Home Products 



; TITLE OF INVENTION: Beta-amyloid Peptide-Binding Proteins and Polynucleotides Encoding the
; TITLE OF INVENTION: Same
; FILE REFERENCE: AHP981261p2
; CURRENT APPLICATION NUMBER: US/09/852,100A
; CURRENT FILING DATE: 2001-05-09
; PRIOR APPLICATION NUMBER: US 09/172,990
; PRIOR FILING DATE: 1998-10-14
; PRIOR APPLICATION NUMBER: US 60/104,104
; PRIOR FILING DATE: 1998-10-13
; PRIOR APPLICATION NUMBER: PTC/US99/21621
; PRIOR FILING DATE: 1999-10-13
; PRIOR APPLICATION NUMBER: US 09/060,609
; PRIOR FILING DATE: 1998-04-15
; PRIOR APPLICATION NUMBER: US 60/064,583
; PRIOR FILING DATE: 1997-04-16
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 2
; LENGTH: 269
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-852-100A-2

Query Match 100.0%; Score 1439; DB 9; Length 269; 

Best Local Similarity 100.0%; Pred. No. 3e-135; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
RESULT 2

US-09-833-503A-2 

; Sequence 2, Application US/09833503A 

; Patent No. US20020146760A1 

; GENERAL INFORMATION: 

; APPLICANT: Ozenberger, Bradley A 

; APPLICANT: Kajkowski, Eileen M 

; APPLICANT: Lo, Ching-Hsiung F 



; APPLICANT: American Home Products Corporation 

; TITLE OF INVENTION: No. US20020146760A1el G-Protein-Coupled Receptor-Like Proteins and
; TITLE OF INVENTION: Polynucleotides Encoded By Them, and Methods of Using
; TITLE OF INVENTION: Same
; FILE REFERENCE: AHP98165-00PCT
; CURRENT APPLICATION NUMBER: US/09/833,503A
; CURRENT FILING DATE: 2000-10-13
; PRIOR APPLICATION NUMBER: 60/104,104
; PRIOR FILING DATE: 1998-10-13
; NUMBER OF SEQ ID NOS: 6
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2
; LENGTH: 269
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-833-503A-2

Query Match 100.0%; Score 1439; DB 9; Length 269; 

Best Local Similarity 100.0%; Pred. No. 3e-135; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269



RESULT 3 
US-10-199-881-2 

; Sequence 2, Application US/10199881 
; Publication No. US20030096356A1 
; GENERAL INFORMATION: 
; APPLICANT: Wyeth 

; TITLE OF INVENTION: No. US20030096356A1el G-Protein-Coupled Receptor-Like Proteins and Polynucleotides
; TITLE OF INVENTION: Encoded by Them, and Methods of Using Same"
; FILE REFERENCE: AHP98165C1
; CURRENT APPLICATION NUMBER: US/10/199,881
; CURRENT FILING DATE: 2002-07-18
; PRIOR APPLICATION NUMBER: PCT/US99/21621
; PRIOR FILING DATE: 1999-10-13
; PRIOR APPLICATION NUMBER: US 90/833,5081
; PRIOR FILING DATE: 2001-12-04
; PRIOR APPLICATION NUMBER: US 60/104,104
; PRIOR FILING DATE: 1998-10-13
; NUMBER OF SEQ ID NOS: 45
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 2
; LENGTH: 269
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-199-881-2

Query Match 100.0%; Score 1439; DB 14; Length 269; 

Best Local Similarity 100.0%; Pred. No. 3e-135; 

Matches 269; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MHILKGSPNVIPRAHGQKNTRRDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV 60

Qy  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 SKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYIC 120

Qy 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 KDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNG 180

Qy 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDG 240

Qy 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||||
Db 241 SSYIIDYYGTRLTRLSITNETFRKTQLYP 269



RESULT 4 

US-09-974-879-230 

; Sequence 230, Application US/09974879 

; Publication No. US20030028003A1 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al . 

TITLE OF INVENTION: 125 Human Secreted Proteins 

FILE REFERENCE: PZ020P2 
; CURRENT APPLICATION NUMBER: US/09/974, 879 
; CURRENT FILING DATE: 2001-10-12 
; PRIOR APPLICATION NUMBER: US 60/239,893 
; PRIOR FILING DATE: 2000-10-13 
; PRIOR APPLICATION NUMBER: US 09/818,683 
; PRIOR FILING DATE: 2001-03-28 
; PRIOR APPLICATION NUMBER: US 09/305,736 
; PRIOR FILING DATE: 1999-05-05 
; PRIOR APPLICATION NUMBER: PCT/US98/23435 
; PRIOR FILING DATE: 1998-11-04 
; PRIOR APPLICATION NUMBER: US 60/064,911 
; PRIOR FILING DATE: 1997-11-07 



PRIOR APPLICATION NUMBER: US 60/064,912 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,983 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,900 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,988 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,987 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,908 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,984 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/064,985 
PRIOR FILING DATE: 1997-11-07 
PRIOR APPLICATION NUMBER: US 60/066,094 
PRIOR FILING DATE: 1997-11-17 
PRIOR APPLICATION NUMBER: US 60/066,100 
PRIOR FILING DATE: 1997-11-17 
PRIOR APPLICATION NUMBER: US 60/066,089 
PRIOR FILING DATE: 1997-11-17 
PRIOR APPLICATION NUMBER: US 60/066,095 
PRIOR FILING DATE: 1997-11-17 
PRIOR APPLICATION NUMBER: US 60/066,090 
PRIOR FILING DATE: 1997-11-17 
NUMBER OF SEQ ID NOS: 611
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 230
LENGTH: 221
TYPE: PRT
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: SITE
LOCATION: (184)
OTHER INFORMATION: Xaa equals any of the naturally occurring L-amino acids
US-09-974-879-230 

Query Match 14.0%; Score 201; DB 10; Length 221; 

Best Local Similarity 45.7%; Pred. No. 8.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 5 

US-09-305-736-230 

; Sequence 230, Application US/09305736 
; Publication No. US20030088078A1
; GENERAL INFORMATION: 
; APPLICANT: Feng et al . 



TITLE OF INVENTION: 125 Human Secreted Proteins 
FILE REFERENCE: PZ020P1 

CURRENT APPLICATION NUMBER: US/09/305, 736 
CURRENT FILING DATE: 1999-05-05 
EARLIER APPLICATION NUMBER: PCT/US98/23435 
EARLIER FILING DATE: 1998-11-04 
EARLIER APPLICATION NUMBER: 60/064,911 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,912 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,983 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,900 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,988 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,987 
EARLIER FILING DATE : 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,908 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,984 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/064,985 
EARLIER FILING DATE: 1997-11-07 
EARLIER APPLICATION NUMBER: 60/066,094 
EARLIER FILING DATE: 1997-11-17 
EARLIER APPLICATION NUMBER: 60/066,100 
EARLIER FILING DATE: 1997-11-17 
EARLIER APPLICATION NUMBER: 60/066,089 
EARLIER FILING DATE: 1997-11-17 
EARLIER APPLICATION NUMBER: 60/066,095
EARLIER FILING DATE: 1997-11-17 
EARLIER APPLICATION NUMBER: 60/066,090 
EARLIER FILING DATE: 1997-11-17 
NUMBER OF SEQ ID NOS: 612
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 230
LENGTH: 222
TYPE: PRT
ORGANISM: Homo sapiens
FEATURE:
NAME/KEY: SITE
LOCATION: (184)
OTHER INFORMATION: Xaa equals any of the naturally occurring L-amino acids
FEATURE:
NAME/KEY: SITE
LOCATION: (222)
OTHER INFORMATION: Xaa equals stop translation
US-09-305-736-230 

Query Match 14.0%; Score 201; DB 10; Length 222; 

Best Local Similarity 45.7%; Pred. No. 8.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 6 

US-09-818-683-230 

; Sequence 230, Application US/09818683 

; Publication No. US20030211472A1 

; GENERAL INFORMATION: 

; APPLICANT: Feng et al. 

TITLE OF INVENTION: 125 Human Secreted Proteins 
; FILE REFERENCE: PZ020P1 

; CURRENT APPLICATION NUMBER: US/09/818,683
; CURRENT FILING DATE: 2001-03-28 

Prior application data removed - consult PALM or file wrapper 
; NUMBER OF SEQ ID NOS: 612
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 230
; LENGTH: 222
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: SITE
; LOCATION: (184)
; OTHER INFORMATION: Xaa equals any of the naturally occurring L-amino acids
; FEATURE:
; NAME/KEY: SITE
; LOCATION: (222)
; OTHER INFORMATION: Xaa equals stop translation
US-09-818-683-230 

Query Match 14.0%; Score 201; DB 11; Length 222; 

Best Local Similarity 45.7%; Pred. No. 8.4e-12; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWXEGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 7 

US-09-833-503A-6 

; Sequence 6, Application US/09833503A 

; Patent No. US20020146760A1

; GENERAL INFORMATION: 

; APPLICANT: Ozenberger, Bradley A 

; APPLICANT: Kajkowski, Eileen M 

; APPLICANT: Lo, Ching-Hsiung F 

; APPLICANT: American Home Products Corporation 

; TITLE OF INVENTION: No. US20020146760A1el G-Protein-Coupled Receptor-Like Proteins and
; TITLE OF INVENTION: Polynucleotides Encoded By Them, and Methods of Using
; TITLE OF INVENTION: Same
; FILE REFERENCE: AHP98165-00PCT
; CURRENT APPLICATION NUMBER: US/09/833,503A
; CURRENT FILING DATE: 2000-10-13
; PRIOR APPLICATION NUMBER: 60/104,104
; PRIOR FILING DATE: 1998-10-13
; NUMBER OF SEQ ID NOS: 6
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 6
; LENGTH: 221
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-833-503A-6

Query Match 13.9%; Score 200; DB 9; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11;

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 8 

US-09-992-600A-82 

; Sequence 82, Application US/09992600A 

; Publication No. US20030027161A1 

; GENERAL INFORMATION: 

; APPLICANT: Bejanin, Stephane

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 91.US4.DIV 

; CURRENT APPLICATION NUMBER: US/09/992,600A

; CURRENT FILING DATE: 2001-11-13 

; PRIOR APPLICATION NUMBER: US 09/924,340 

PRIOR FILING DATE: 2001-08-06 
; PRIOR APPLICATION NUMBER: PCT/IB01/01715 
; PRIOR FILING DATE: 2001-08-06 

PRIOR APPLICATION NUMBER: US 60/305,456 
; PRIOR FILING DATE: 2001-07-13 
; PRIOR APPLICATION NUMBER: US 60/302,277 
; PRIOR FILING DATE: 2001-06-29 
; PRIOR APPLICATION NUMBER: US 60/298,698 
; PRIOR FILING DATE: 2001-06-15 
; PRIOR APPLICATION NUMBER: US 60/293,574 
; PRIOR FILING DATE: 2001-05-25 
; NUMBER OF SEQ ID NOS: 114 

; SOFTWARE: JPatent 
; SEQ ID NO 82 
; LENGTH: 221 
; TYPE: PRT 

; ORGANISM: Homo sapiens 

; FEATURE: 
; NAME/KEY: SIGNAL 
; LOCATION: 1..32 
US-09-992-600A-82 

Query Match 13.9%; Score 200; DB 10; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 9 

US-09-924-340-82 

; Sequence 82, Application US/09924340 

; Publication No. US20030027248A1 

; GENERAL INFORMATION: 

; APPLICANT: Bejanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 91.US2.REG 

; CURRENT APPLICATION NUMBER: US/09/924,340 

; CURRENT FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS: 112 

; SOFTWARE: JPatent 

; SEQ ID NO 82 

; LENGTH: 221 

; TYPE: PRT 
; ORGANISM: Homo sapiens 
; FEATURE: 

; NAME/KEY: SIGNAL 
; LOCATION: 1..32 
US-09-924-340-82 

Query Match 13.9%; Score 200; DB 10; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 10 
US-09-992-095B-82 

; Sequence 82, Application US/09992095B 

; Publication No. US20030157485A1 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 91.US5.DIV 

; CURRENT APPLICATION NUMBER: US/09/992,095B 

; CURRENT FILING DATE: 2003-02-20 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: PCT/IB01/01715 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS : 112 

; SOFTWARE: JPatent 

; SEQ ID NO 82 

; LENGTH: 221 

; TYPE: PRT 

; ORGANISM: Homo sapiens 

; FEATURE: 
; NAME/KEY: SIGNAL 

; LOCATION: 1..32 
US-09-992-095B-82 

Query Match 13.9%; Score 200; DB 10; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 11 
US-09-999-570-82 

; Sequence 82, Application US/09999570 

; Publication No. US20030170628A1 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 



; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 

; FILE REFERENCE: G-091US08DIV 

; CURRENT APPLICATION NUMBER: US/09/999,570 

; CURRENT FILING DATE: 2001-06-14 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: PCT/IB01/01715 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS : 112 

; SOFTWARE: JPatent 

; SEQ ID NO 82 

; LENGTH: 221 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
; FEATURE: 

; NAME/KEY: SIGNAL 
; LOCATION: 1..32 
US-09-999-570-82 

Query Match 13.9%; Score 200; DB 10; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 12 
US-10-000-489-82 

; Sequence 82, Application US/10000489 

; Publication No. US20030092011A1 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 91.US6.DIV 

; CURRENT APPLICATION NUMBER: US/10/000,489 

; CURRENT FILING DATE: 2001-11-14 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: PCT/IB01/01715 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 



; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS : 112 

; SOFTWARE: JPatent 
; SEQ ID NO 82 

; LENGTH: 221 
; TYPE: PRT 

; ORGANISM: Homo sapiens 

; FEATURE: 
; NAME/KEY: SIGNAL 

; LOCATION: 1..32 
US-10-000-489-82 



Query Match 13.9%; Score 200; DB 14; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 13 
US-10-000-986-82 

; Sequence 82, Application US/10000986 

; Publication No. US20030096247A1 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 91.US9.DIV 

; CURRENT APPLICATION NUMBER: US/10/000,986 

; CURRENT FILING DATE: 2001-11-14 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: PCT/IB01/01715 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS: 112 

; SOFTWARE: JPatent 

; SEQ ID NO 82 



; LENGTH: 221 
; TYPE: PRT 

; ORGANISM: Homo sapiens 

; FEATURE: 
; NAME/KEY: SIGNAL 

; LOCATION: 1..32 
US-10-000-986-82 



Query Match 13.9%; Score 200; DB 14; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 14 
US-10-199-881-6 

; Sequence 6, Application US/10199881 
; Publication No. US20030096356A1 
; GENERAL INFORMATION: 
; APPLICANT: Wyeth 

; TITLE OF INVENTION: Novel G-Protein-Coupled Receptor-Like Proteins and Polynucleotides 

; TITLE OF INVENTION: Encoded by Them, and Methods of Using Same 
; FILE REFERENCE: AHP98165C1 

; CURRENT APPLICATION NUMBER: US/10/199,881 

; CURRENT FILING DATE: 2002-07-18 

; PRIOR APPLICATION NUMBER: PCT/US99/21621 

; PRIOR FILING DATE: 1999-10-13 

; PRIOR APPLICATION NUMBER: US 90/833,5081 

; PRIOR FILING DATE: 2001-12-04 

; PRIOR APPLICATION NUMBER: US 60/104,104 

; PRIOR FILING DATE: 1998-10-13 

; NUMBER OF SEQ ID NOS : 45 

; SOFTWARE: PatentIn version 3.1 

; SEQ ID NO 6 

; LENGTH: 221 

; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-199-881-6 

Query Match 13.9%; Score 200; DB 14; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 15 
US-10-154-678-82 

; Sequence 82, Application US/10154678 

; Publication No. US20030162186A1 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 
; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 
; FILE REFERENCE: 182.US1.REG 

; CURRENT APPLICATION NUMBER: US/10/154,678 

; CURRENT FILING DATE: 2002-10-15 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 

; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS : 112 

; SOFTWARE: JPatent 

; SEQ ID NO 82 

; LENGTH: 221 

; TYPE: PRT 
; ORGANISM: Homo sapiens 

; FEATURE: 

; NAME/KEY: SIGNAL 
; LOCATION: -32..-1 
US-10-154-678-82 

Query Match 13.9%; Score 200; DB 14; Length 221; 

Best Local Similarity 45.7%; Pred. No. 1.1e-11; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



Search completed: March 4, 2004, 10:46:22 
Job time : 653 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Search time 80 Seconds 

(without alignments) 

1060.930 Million cell updates/sec 
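
The throughput figure can be cross-checked against the other numbers in the search header. The sketch below assumes (it is not stated in the report) that one "cell update" is one query-residue by database-residue comparison, reads the search time as 80 seconds, and uses the 269-residue query length and the 315518202 searched residues listed in the header. The function name is illustrative.

```python
def cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, assuming one cell per
    (query residue, database residue) pair."""
    return query_len * db_residues / seconds


# Values taken from the search header of this report.
rate = cell_updates_per_sec(query_len=269, db_residues=315_518_202, seconds=80)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under those assumptions the printed value agrees with the rate reported above.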



Title:          US-09-852-100B-2 
Perfect score: 
Sequence:       1 MHILKGSPNVIPRAHGQKNT..........TRLTRLSITNETFRKTQLYP 269 

Scoring table:  BLOSUM62 
                Gapop 10.0 , Gapext 0.5 

Searched:       1017041 seqs, 315518202 residues 
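
The parameters above (sw model, a substitution table, Gapop 10.0, Gapext 0.5) describe a Smith-Waterman local alignment with affine gap costs. A minimal Python sketch of that scoring recurrence follows. It is not GenCore's implementation: the names `local_score` and `toy_score` are illustrative, the match/mismatch scorer is a stand-in for a BLOSUM62 lookup, and the sketch assumes that a gap of n residues costs Gapop + n x Gapext. That convention is consistent with the half-point scores in this report (for example Score 941.5 for a hit with a single one-residue indel) but is not stated by the report itself.

```python
def local_score(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Best Smith-Waterman local alignment score with affine gaps (Gotoh).

    Assumed convention: a gap of n residues costs gap_open + n * gap_extend.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols        # H[i-1][j]: best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * cols    # F[i-1][j]: best score ending with a residue of a unpaired
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf              # E[i][j-1]: best score ending with a residue of b unpaired
        for j in range(1, cols):
            # Extend an existing gap or open a new one, in each direction.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend, h_prev[j] - gap_open - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(x, y):
    """Illustrative stand-in for a BLOSUM62 lookup: +5 match, -4 mismatch."""
    return 5.0 if x == y else -4.0


# Eight matches (8 * 5) minus one single-residue gap (10.0 + 0.5).
print(local_score("ACDEFGHIK", "ACDEGHIK", toy_score))
```

Swapping `toy_score` for a real BLOSUM62 table would be needed before comparing against the scores printed in this report.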



Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Listing first 45 summaries 

SPTREMBL_25:* 
1: sp_archea:* 
2: sp_bacteria:* 
3: sp_fungi:* 
4: sp_human:* 
5: sp_invertebrate:* 
6: sp_mammal:* 
7: sp_mhc:* 
8: sp_organelle:* 
9: sp_phage:* 
10: sp_plant:* 
11: sp_rodent:* 
12: sp_virus:* 
13: sp_vertebrate:* 
14: sp_unclassified:* 
15: sp_rvirus:* 
16: sp_bacteriap:* 
17: sp_archeap:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
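
The percentage figures printed with each hit can be cross-checked from the other numbers on the same lines. The sketch below assumes, based only on the values in this report and not on GenCore documentation, that Query Match is the hit score divided by the query's perfect (self-match) score, and that Best Local Similarity is Matches divided by the number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are illustrative, and the perfect score of 1439 is taken from the 100.0% self-match listed earlier in this file.

```python
def query_match(score, perfect_score):
    """Hit score as a percentage of the query's self-match score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


PERFECT_SCORE = 1439  # assumed: score of the query aligned to itself

# Values copied from two hits in this report.
print(query_match(200, PERFECT_SCORE))         # hit with Score 200
print(best_local_similarity(53, 12, 39, 12))   # same hit
print(query_match(941.5, PERFECT_SCORE))       # hit with Score 941.5
print(best_local_similarity(177, 10, 20, 1))   # same hit
```

Under these assumptions the computed values reproduce the Query Match and Best Local Similarity percentages printed for those hits.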
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7ALIGNMENTS 



RESULT 1 
Q9BX74 

ID Q9BX74 PRELIMINARY; PRT; 207 AA. 
AC Q9BX74; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 



DE Beta-amyloid binding protein. 

GN BBP. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21276355; PubMed=11278849; 

RA Kajkowski E.M., Lo C.F., Ning X., Walker S., Sofia H.J., Wang W., 

RA Edris W., Chanda P., Wagner E., Vile S., Ryan K., McHendry-Rinde B., 

RA Smith S.C., Wood A., Rhodes K.J., Kennedy J.D., Bard J., 

RA Jacobsen J.S., Ozenberger B.A.; 

RT "beta-Amyloid Peptide-induced Apoptosis Regulated by a Novel Protein 

RT Containing a G Protein Activation Module."; 

RL J. Biol. Chem. 276:18748-18756(2001). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Testis; 

RA Strausberg R. ; 

RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF353990; AAK35064.1; -. 

DR EMBL; BC029486; AAH29486.1; -. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

KW Signal. 

FT SIGNAL 1 37 POTENTIAL. 

SQ SEQUENCE 207 AA; 22326 MW; A5590FD7AECDF292 CRC64; 

Query Match 77.3%; Score 1113; DB 4; Length 207; 

Best Local Similarity 100.0%; Pred. No. 1.8e-94; 

Matches 207; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy  63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 122
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKD 60

Qy 123 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 182
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYS 120

Qy 183 YKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSS 242
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 YKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSS 180

Qy 243 YIIDYYGTRLTRLSITNETFRKTQLYP 269
       |||||||||||||||||||||||||||
Db 181 YIIDYYGTRLTRLSITNETFRKTQLYP 207



RESULT 2 
Q99MB3 

ID Q99MB3 PRELIMINARY; PRT; 208 AA. 

AC Q99MB3; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 



DE Beta-amyloid binding protein. 

GN BBP. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BALB/c; 

RX MEDLINE=21276355; PubMed=11278849; 

RA Kajkowski E.M., Lo C.F., Ning X., Walker S., Sofia H.J., Wang W., 

RA Edris W., Chanda P., Wagner E., Vile S., Ryan K., McHendry-Rinde B., 

RA Smith S.C., Wood A., Rhodes K.J., Kennedy J.D., Bard J., 

RA Jacobsen J.S., Ozenberger B.A.; 

RT "beta-Amyloid Peptide-induced Apoptosis Regulated by a Novel Protein 

RT Containing a G Protein Activation Module."; 

RL J. Biol. Chem. 276:18748-18756(2001). 

DR EMBL; AF353993; AAK35067.1; -. 

DR MGD; MGI:2137022; Bbp. 

DR GO; GO:0005887; C:integral to plasma membrane; IDA. 

DR GO; GO:0001540; F:beta-amyloid binding; IPI. 

DR GO; GO:0004930; F:G-protein coupled receptor activity; IDA. 

DR GO; GO:0008624; P:induction of apoptosis by extracellular sig...; IDA. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 208 AA; 22271 MW; 91A7932163F4F04C CRC64; 

Query Match 65.4%; Score 941.5; DB 11; Length 208; 

Best Local Similarity 85.1%; Pred. No. 1.1e-78; 

Matches 177; Conservative 10; Mismatches 20; Indels 1; Gaps 1; 

Qy 63 MAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSA-GGEESLKCEDLKVGQYICK 121 

I I I I I I : I :: I I I : M I : I I I I I : I I I I I : I I I I I : I I I I I I I 

Db 1 MAAAWPAGRASPAAGPPGLLRTLWLWVAAGHCGAAASGAVGGEETPKCEDLRVGQYICK 60 

Qy 122 DPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGY 181
       :|||||||||||||||||||| ||||| ||||| |||||||||:|||| |||||||||||
Db  61 EPKINDATQEPVNCTNYTAHVQCFPAPKITCKDLSGNETHFTGSEVGFLKPISCRNVNGY 120

Qy 182 SYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGS 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 SYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGS 180

Qy 242 SYIIDYYGTRLTRLSITNETFRKTQLYP 269
       ||||||||||||||||||||||||||||
Db 181 SYIIDYYGTRLTRLSITNETFRKTQLYP 208



RESULT 3 
Q9W2H1 

ID Q9W2H1 PRELIMINARY; PRT; 178 AA. 

AC Q9W2H1; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE CG10795 protein (LD27358P) . 

GN CG10795. 



OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J., 
RA Champe M., Chavez C., Dorsett V., Farfan D., Frise E., George R., 
RA Gonzalez M., Guarin H., Li P., Liao G., Miranda A., Mungall C.J., 
RA Nunoo J., Pacleb J., Paragas V., Park S., Phouanenavong S., Wan K., 
RA Yu C., Lewis S.E., Rubin G.M., Celniker S.; 

RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AE003453; AAF46720.1; -. 

DR EMBL; AY061343; AAL28891.1; -. 

DR FlyBase; FBgn0034626; CG10795. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 178 AA; 19896 MW; 17C41166607ACC03 CRC64; 

Query Match 23.5%; Score 338; DB 5; Length 178; 

Best Local Similarity 42.6%; Pred. No. 3.8e-23; 

Matches 69; Conservative 30; Mismatches 49; Indels 14; Gaps 5; 

Qy 107 SLKCEDLK-VGQYICKDP---KINDATQEPVNCTNY-TAHVSCFPAPNITCKDSSGNETH 161 

: : I : I : : M : : I I I : I : I I : II I I I I I l : : I I I 

Db  20 NVDCNELQMMGQFMCPDPARGQIDPKTQQLAGCTREGRARVWCIAANEINCTE-TGNAT- 77 

Qy 162 FTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGI 221 

| : : I : I I I : I I : I I I I I I I I I I I I : I I I I I I I : I : 

Db 78 FTREVPCKWTNGYHLDTTLLLSVFLGMFGVDRFYLGYPGIGLLKFCTLGGMFL 130 

Qy 222 GSLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFR 263 

I I I I : I I : : I : I I I : I I I : I : I I I I : : I hi 

Db 131 GQLI DI VLI ALQWGPADGS AYVI P YYGAGI HI VRS DNTT YR 172 



RESULT 4 
Q95PJ8 

ID Q95PJ8 PRELIMINARY; PRT; 329 AA. 

AC Q95PJ8; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created) 

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Y66D12A.21 protein. 

GN Y66D12A.21. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Sulston J.E.; 

RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99069613; PubMed=9851916; 

RA none; 

RT "Genome sequence of the nematode C. elegans: A platform for 

RT investigating biology."; 

RL Science 282:2012-2018(1998). 

DR EMBL; AL161712; CAC35892.1; -. 

DR WormPep; Y66D12A.21; CE26465. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 329 AA; 38167 MW; 9C6FB3EE7E3866D0 CRC64; 



Query Match 19.3%; Score 278; DB 5; Length 329; 

Best Local Similarity 36.6%; Pred. No. 2.8e-17; 

Matches 63; Conservative 34; Mismatches 59; Indels 16; Gaps 5; 

Qy 98 VATSAGGEESLKCEDLKVGQYICKDPKINDATQEPVNC-TNYTAHVSCFPAPNITC--KD 154 

: : I I : : : | | : | | I I : I I : : : I I : I I : : II I : I I I I 
Db 11 I SVSA- SDATVKCDDLDPNQYLCKNYAVT)TITQQSVTCAADNS IQVMCETAEHI KCVGKD 69 

Qy 155 SSG--NETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLK 212 

I || I : I II I I I : II I : I I I I II I I I I I : I 

Db 70 QFGIFNRT VPSACHYGAHVSYTTTVLLSIFLGFFGIDRIYLGYYALGLIK 119. 

Qy 213 FCTVGFCGIGSLIDFILISMQIVGPSDGSSYIIDYYGTRLTRLSITNETFRK 264 

: : I : I : I I I I I : I : : I I : I I : : I : I I I : : : : : I 
Db 120 MFSLGGLFVFWLVDIILISLQLLGPADGTAYAMAYYGPKAQMIRLVATIWKK 171 



RESULT 5 
Q9H651 

ID Q9H651 PRELIMINARY; PRT; 221 AA. 

AC Q9H651; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical protein FLJ22604 (BBP-like protein 2). 

GN BLP2. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Small intestine; 

RA Watanabe K., Kumagai A., Itakura S., Yamazaki M., Tashiro H., Ota T., 

RA Suzuki Y., Obayashi M., Nishi T., Shibahara T., Tanaka T., 

RA Nakamura Y., Isogai T., Sugano S.; 

RT "NEDO human cDNA sequencing project."; 

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Kajkowski E.M., Lo C.F., Ning X., Walker S., Sofia H.J., Wang W., 

RA Edris W., Chanda P., Wagner E., Vile S., Ryan K., McHendry-Rinde B., 

RA Smith S.C., Wood A., Rhodes K.J., Kennedy J.D., Bard J., 

RA Jacobsen J.S., Ozenberger B.A.; 

RT "Beta-amyloid peptide-induced apoptosis regulated by a novel protein 

RT containing a G protein activation module."; 

RL J. Biol. Chem. 0:0-0(2001). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Muscle; 

RA Strausberg R.; 

RL Submitted (MAY-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK026257; BAB15415.1; -. 

DR EMBL; AF353992; AAK35066.1; -. 

DR EMBL; BC008873; AAH08873.1; -. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 221 AA; 24410 MW; 92151D6EF6363D74 CRC64; 



Query Match 13.9%; Score 200; DB 4; Length 221; 

Best Local Similarity 45.7%; Pred. No. 2.6e-10; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5; 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 112 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 165

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 166 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 221



RESULT 6 
Q9BRN9 

ID Q9BRN9 PRELIMINARY; PRT; 247 AA. 

AC Q9BRN9; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Similar to hypothetical protein FLJ22604. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Colon; 

RA Strausberg R.; 

RL Submitted (APR-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC006150; AAH06150.1; -. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 247 AA; 27161 MW; CE1D0D9C53DDF73C CRC64; 

Query Match 13.9%; Score 200; DB 4; Length 247; 

Best Local Similarity 45.7%; Pred. No. 3e-10; 

Matches 53; Conservative 12; Mismatches 39; Indels 12; Gaps 5 

Qy 135 CTNYTA--HVSC----FPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVA 188
       ||| |:   |||    :|| | | :|      |  ||   | | : |    || :  |:|
Db 138 CTNSTSCMTVSCPRQRYPA-NCTVRD----HVHCLGNRT-FPKMLYCNWTGGYKWSTALA 191

Qy 189 LSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244
       ||: ||  ||||||||    || |  : |  || :||| :|| :  |||:||| ||
Db 192 LSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDVLLIGVGYVGPADGSLYI 247



RESULT 7 
Q8BJ83 

ID Q8BJ83 PRELIMINARY; PRT; 261 AA. 

AC Q8BJ83; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Similar to BBP-like protein 2. 

GN 5930422O05RIK. 



OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Forelimb; 

RX MEDLINE=22354683; PubMed=12466851; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs."; 

RL Nature 420:563-573(2002). 

DR EMBL; AK077858; BAC37037.1; -. 

DR MGD; MGI:1924429; 5930422O05Rik. 

DR InterPro; IPR007829; TM2. 

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 261 AA; 28880 MW; 70346780D3CF5CDB CRC64; 

Query Match 13.8%; Score 198.5; DB 11; Length 261; 

Best Local Similarity 26.8%; Pred. No. 4.4e-10; 

Matches 69; Conservative 23; Mismatches 86; Indels 79; Gaps 6 

Qy 46 LLGGGGSGSGEKVSVSKMAAAWPSGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGE 105 

: I I I I : I I III: Ml I : I 
Db 26 ILSGDGSLNLEHSQPLAQAIKDP-GPTRTFSWPRAAENQLFSHLT 70 

Qy 106 ESLKCEDLKVGQYICKDPKINDATQEPVNCTNYTAHVS CFPAPNITCKDS 155 

I : : I : I I : : I : I : I I I I : : I I I 

Db 71 ESTEIPPYMTKCPSNGLCSRLPADCIECATNVSCTYGKPVTFDCTVKPSVTCVDQ 125 

Qy 156 S GNET HFTGNEV 167 

II III 
Db 126 DLKPQRNFVINMTCRFCWQLPETDYECSNSTTCMTVACPRQRYFANCTVRDHIHCLGNRT 185 

Qy 168 GFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDF 227 

I I : I II: I : I I I : I I I I I I I I I I III : I I I : I I I 

Db 186 -FPKLLYCNWTGGYKWSTALALSITLGGFGADRFYLGQWREGLGKLFSFGGLGIWTLIDV 244 

Qy 228 ILISMQIVGPSDGSSYI 244 

: I I : I I I : I I I I I 
Db 245 LLIGVGYVGPADGSLYI 261 



RESULT 8 
Q9D156 

ID Q9D156 PRELIMINARY; PRT; 230 AA. 

AC Q9D156; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE 1110025I09Rik protein (RIKEN cDNA 1110025I09 gene). 

GN BLP2 OR 1110025I09RIK. 

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 



RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Embryo; 

RX MEDLINE=21085660; PubMed=11217851; 

RA Kawai J., Shinagawa A., Shibata K., Yoshino M., Itoh M., Ishii Y., 
RA Arakawa T., Hara A., Fukunishi Y., Konno H., Adachi J., Fukuda S., 
RA Aizawa K., Izawa M., Nishi K., Kiyosawa H., Kondo S., Yamanaka I., 
RA Saito T., Okazaki Y., Gojobori T., Bono H., Kasukawa T., Saito R., 
RA Kadota K., Matsuda H.A., Ashburner M., Batalov S., Casavant T., 
RA Fleischmann W., Gaasterland T., Gissi C., King B., Kochiwa H., 
RA Kuehl P., Lewis S., Matsuo Y., Nikaido I., Pesole G., Quackenbush J., 
RA Schriml L.M., Staubli F., Suzuki R., Tomita M., Wagner L., Washio T., 
RA Sakai K., Okido T., Furuno M., Aono H., Baldarelli R., Barsh G., 
RA Blake J., Boffelli D., Bojunga N., Carninci P., de Bonaldo M.F., 
RA Brownstein M.J., Bult C., Fletcher C., Fujita M., Gariboldi M., 
RA Gustincich S., Hill D., Hofmann M., Hume D.A., Kamiya M., Lee N.H., 
RA Lyons P., Marchionni L., Mashima J., Mazzarelli J., Mombaerts P., 
RA Nordone P., Ring B., Ringwald M., Rodriguez I., Sakamoto N., 
RA Sasaki H., Sato K., Schoenbach C., Seya T., Shibata Y., Storch K.-F., 
RA Suzuki Toyo-oka K., Wang K.H., Weitz C., Whittaker C., Wilming L., 
RA Wynshaw-Boris A., Yoshida K., Hasegawa Y., Kawaji H., Kohtsuki S., 
RA Hayashizaki Y.; 

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Salivary gland; 

RA Strausberg R.;

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AK003917; BAB23075.1; -.

DR EMBL; BC024620; AAH24620.1;

DR MGD; MGI:1915884; Blp2.

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 230 AA; 25639 MW; 396D650D8BEE99A5 CRC64; 

Query Match 13.6%; Score 196; DB 11; Length 230; 

Best Local Similarity 28.9%; Pred. No. 6.3e-10; 

Matches 59; Conservative 19; Mismatches 64; Indels 62; Gaps 5 

Qy 102 AGGEESL KCEDLKVGQYI CKDPKINDATQEPVNCTNYTAHVS CFPAP 148 

:| I I | :: |: I I : : I : I :|| I I 

Db 28 SGDENQLFSHLTESTEIPPYMTKCPSNGLCSRLPADCIECATNVSCTYGKPVTFDCTVKP 87 

Qy 149 NITCKDSS GNET 160 

Db 88 SVTCVDQDLKPQRNFVINMTCRFCWQLPETDYECSNSTTCMTVACPRQRYFANCTVRDHI 147 

Qy 161 HFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCG 220 

I || I I : I II: I : I I I : I I 11111111 III : I I 

Db 148 HCLGNRT-FPKLLYCNWTGGYKWSTALALSITLGGFGADRFYLGQWREGLGKLFSFGGLG 206 

Qy 221 IGSLIDFILISMQIVGPSDGSSYI 244 

I : I I I : I I : I I I : I I I I I 
Db 207 IWTLIDVLLIGVGYVGPADGSLYI 230 



RESULT 9 
Q9U4H5 

ID Q9U4H5 PRELIMINARY; PRT; 284 AA.

AC Q9U4H5; Q9W361;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE BCDNA.GH02974 (ALMONDEX) (AMX protein).

GN AMX OR BCDNA:GH02974 OR CG12127.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RA Rubin G.M., Wan K.H., Harvey D., Lewis S.E., Brokstein P., Tsang G.,

RA Agbayani A., Arcaina T.T., Baxter E., Blazej R.G., Butenhoff C.,

RA Champe M., Chavez C., Chew M., Doyle C.M., Farfan D.E., Frise E.,

RA Galle R., George R.A., Harris N.L., Hoskins R.A., Evans-Holm M.,

RA Houston K.A., Hummasti S.R., Kim E., Li P., Moshrefi M., Pacleb J.M.,

RA Park S., Sequeira A., Sethi H., Snir E., Svirskas R.R., Weinburg T.,

RA Celniker S.E.;

RT "Full Length Drosophila melanogaster cDNA sequence.";

RL Submitted (AUG-1999) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ovary; 

RA Michellod M.-A.E., Remillieux N.C., Randsholt N.B.; 

RT "Characterization of almondex."; 

RL Submitted (MAY-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

DR EMBL; AF181623; AAD55409.1; 

DR EMBL; AF217797; AAF36924.2; -. 

DR EMBL; AE003446; AAF46474.2; -. 

DR FlyBase; FBgn0000077; amx. 

DR GO; GO:0007498; P: mesoderm development; IMP. 

DR InterPro; IPR001304; Lectin_C. 

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1.

DR PROSITE; PS00615; C_TYPE_LECTIN_1; 1.

SQ SEQUENCE 284 AA; 31364 MW; 8FB8FFB5733AC851 CRC64; 

Query Match 12.6%; Score 182; DB 5; Length 284; 

Best Local Similarity 33.3%; Pred. No. 1.6e-08; 

Matches 50; Conservative 21; Mismatches 51; Indels 28; Gaps 5 

Qy 104 GEESLK CEDLKVGQYI CKDPKINDATQEPVNCTNYTAH — VSCFPAPNITCKD 154 

III: I :: I I : : : : I I I I I I 
Db 154 GERSFQRQMNCRYCYQTEMWQQSCGQRSSCNSATDKLFRTNCTVHHDVLCL 204 

Qy 155 SSGNETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFC 214

I I : I I I : I II: I : : I I I I I I I II I I I : hi 

Db 205 — GNRS-FTRN LRCNWTQGYRWSTALLISLTLGGFGADRFYLGHWQEGIGKLF 254 



Qy 215 TVGFCGIGSLIDFILISMQIVGPSDGSSYI 244 

: I I : : : I I : I I I I : I I : I I I I I 
Db 255 SFGGLGVWTIIDVLLISMHYLGPADGSLYI 284



RESULT 10 
Q9H046 

ID Q9H046 PRELIMINARY; PRT; 80 AA. 

AC Q9H046;

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical protein (Fragment) . 

GN DKFZP667C1011. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;



RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Lymph node;

RA Koehrer K., Beyer A., Mewes H.W., Weil B., Wiemann S.;

RL Submitted (DEC-2000) to the EMBL/GenBank/DDBJ databases.

DR EMBL; ALolzDoy; CACzlo4/.l;

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1.

KW Hypothetical protein.

FT NON_TER 1 1

SQ SEQUENCE 80 AA; 8699 MW; 8BE6BE788235C58D CRC64;



Query Match 12.0%; Score 172; DB 4; Length 80; 

Best Local Similarity 46.6%; Pred. No. 2.7e-08; 

Matches 41; Conservative 9; Mismatches 30; Indels 8; Gaps 1 

Qy 157 GNETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTV 216

I I I I I : I II: I : I I I : I I I I I I I I I I III : 

Db 1 GNRT--------FPKMLYCNWTGGYKWSTALALSITLGGFGADRFYLGQWREGLGKLFSF 52

Qy 217 GFCGIGSLIDFILISMQIVGPSDGSSYI 244 

I I I : I I I : I I : I I I : I I I I I 
Db 53 GGLGIWTLIDVLLIGVGYVGPADGSLYI 80



RESULT 11 
Q95QZ5 

ID Q95QZ5 PRELIMINARY; PRT; 195 AA. 

AC Q95QZ5; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created)

DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)

DE Hypothetical protein.

GN C41D11.9.

OS Caenorhabditis elegans.

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2; 

RX MEDLINE=99069613; PubMed=9851916; 

RA None; 

RT "Genome sequence of the nematode C. elegans: a platform for 

RT investigating biology. The C. elegans Sequencing Consortium."; 

RL Science 282:2012-2018(1998). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2;

RA Gattung S., Maggi L.;

RT "The sequence of C. elegans cosmid C41D11.";

RL Submitted (MAY-1997) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RC STRAIN=Bristol N2;

RA Waterston R.;

RT "Direct Submission."; 



RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF003740; AAL08031.1; -. 

DR WormPep; C41D11.9; CE29489. 

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 195 AA; 21203 MW; 35945E407F184DAE CRC64; 

Query Match 11.8%; Score 169.5; DB 5; Length 195; 

Best Local Similarity 28.9%; Pred. No. 1.4e-07; 

Matches 48; Conservative 27; Mismatches 62; Indels 29; Gaps 5; 

Qy 104 GEESLKCE DLKVGQYICKDPKINDATQE PVNCTNYTA HVSCF 145 

| || I : : I : I : I : : : I I : I I I 

Db 34 GSAGLTCTFPGDCRIGDTV KVNCTSRKGCPNPVSRNNVEAVCRFCWQLLPGDYDCE 89 

Qy 146 PAPNITCKDS S GN ETH FT GN EVG F FK P I S C RNVN G YS YKVAVAL S LFL GWLGA 198 

|||: : I : : : hill : I I I : : I I : I I II 

Db 90 PATNCST S STKLLVTKCSAHS S VI CMGQRNFYKRI PCNWS S GYSWTKTMI LS WLGGFGA 149 

Qy 199 DRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244 

I I I I I I : I : I | : : I : I : I I : : : I I I I I I 

Db 150 DRFYLGLWKSAIGKLFSFGGLGVWTLVDWLIAVGYIKPYDGSMYI 195 



RESULT 12 
Q9BX73

ID Q9BX73 PRELIMINARY; PRT; 214 AA. 

AC Q9BX73; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE BBP-like protein 1. 

GN BLP1. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21276355; PubMed=11278849;

RA Kajkowski E.M., Lo C.F., Ning X., Walker S., Sofia H.J., Wang W.,

RA Edris W., Chanda P., Wagner E., Vile S., Ryan K., McHendry-Rinde B.,

RA Smith S.C., Wood A., Rhodes K.J., Kennedy J.D., Bard J.,

RA Jacobsen J.S., Ozenberger B.A.;

RT "beta-Amyloid Peptide-induced Apoptosis Regulated by a Novel Protein 

RT Containing a G Protein Activation Module."; 

RL J. Biol. Chem. 276:18748-18756(2001). 

DR EMBL; AF353991; AAK35065.1; 

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1. 

SQ SEQUENCE 214 AA; 22871 MW; BB928712AF2F78A8 CRC64; 

Query Match 9.5%; Score 136.5; DB 4; Length 214; 

Best Local Similarity 27.8%; Pred. No. 0.00018; 

Matches 58; Conservative 21; Mismatches 83; Indels 47; Gaps 10; 



Qy 59 SVSKMAAAWP SGPSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEES — LKCED 112 

11:111 : I : I I MM: I I : I 

Db 33 S H S QNAT AE PELT S AGAAQ P E GPGGAASWEYGDPHSPVILCSY 75 

Qy 113 LKVGQYICKDP — KINDAT — QE-PVNCTNYTAH VSCFPAPNITCKDSSGN 158 

I I : II : : I I I I I : II I I : 
Db 76 LPDEFIECEDPVDHVGNATASQELGYGCLKFGGQAYSDVEHTSVQCHALDGIEC ASP 132 

Qy 159 ETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGF 218

I I Ml I : : : I I II I I I I II : : I I : I 

Db 133 RTFLREN KP — CIKYTGHYFITTLLYSFFLGCFGVDRFCLGHTGTAVGKLLTLGG 185 

Qy 219 CGIGSLIDFILISMQIVGPSDGSSYIIDY 247 

II :| II: : Mill:: I 

Db 186 LGIWWFVDLILLITGGLMPSDGSNWCTVY 214 



RESULT 13 
Q9VY86 

ID Q9VY86 PRELIMINARY; PRT; 172 AA. 

AC Q9VY86; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE CG11103 protein (LP03404p).

GN CG11103.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RA Celniker S.E., Adams M.D., Kronmiller B., Wan K.H., Holt R.A.,

RA Evans C.A., Gocayne J.D., Amanatides P.G., Brandon R.C., Rogers Y.,

RA Banzon J., An H., Baldwin D., Banzon J., Beeson K.Y., Busam D.A.,

RA Carlson J.W., Center A., Champe M., Davenport L.B., Dietz S.M.,

RA Dodson K., Dorsett V., Doup L.E., Doyle C., Dresnek D., Farfan D.,

RA Ferriera S., Frise E., Galle R.F., Garg N.S., George R.A.,

RA Gonzalez M., Houck J., Hoskins R.A., Hostin D., Howland T.J.,

RA Ibegwam C., Jalali M., Kruse D., Li P., Mattei B., Moshrefi A.,

RA McIntosh T.C., Moy M., Murphy B., Nelson C., Nelson K.A., Nunoo J.,

RA Pacleb J., Paragas V., Park S., Patel S., Pfeiffer B.,

RA Phouanenavong S., Pittman G.S., Puri V., Richards S., Scheeler F.,

RA Stapleton M., Strong R., Svirskas R., Tector C., Tyler D.,

RA Williams S.M., Zaveri J.S., Smith H.O., Venter J.C., Rubin G.M.;

RT "Sequencing of Drosophila melanogaster genome.";

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RA Misra S., Crosby M.A., Matthews B.B., Bayraktaroglu L., Campbell K.,

RA Hradecky P., Huang Y., Kaminker J.S., Prochnik S.E., Smith C.D.,

RA Tupy J.L., Bergman C., Berman B., Carlson J.W., Celniker S.E.,

RA Clamp M., Drysdale R., Emmert D., Frise E., de Grey A., Harris N.,

RA Kronmiller B., Marshall B., Millburn G., Richter J., Russo S.,

RA Searle S.M.J., Smith E., Shu S., Smutniak F., Whitfield E.,

RA Ashburner M., Gelbart W.M., Rubin G.M., Mungall C.J., Lewis S.E.;

RT "Annotation of Drosophila melanogaster genome.";

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [4] 

RP SEQUENCE FROM N.A. 

RA Adams M.D., Celniker S.E., Gibbs R.A., Rubin G.M., Venter C.J.;

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.

RN [5] 

RP SEQUENCE FROM N.A. 

RA FlyBase; 

RL Submitted (SEP-2002) to the EMBL/GenBank/DDBJ databases.

RN [6]

RP SEQUENCE FROM N.A.

RC STRAIN=Berkeley;

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J.,

RA Champe M., Chavez C., Dorsett V., Dresnek D., Farfan D., Frise E.,

RA George R., Gonzalez M., Guarin H., Kronmiller B., Li P., Liao G.,

RA Miranda A., Mungall C.J., Nunoo J., Pacleb J., Paragas V., Park S.,

RA Patel S., Phouanenavong S., Wan K., Yu C., Lewis S.E., Rubin G.M.,

RA Celniker S.;

RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AE003493; AAF48318.2; 

DR EMBL; AY119007; AAM50867.1; 

DR FlyBase; FBgn0030522; CG11103. 

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1.

SQ SEQUENCE 172 AA; 18809 MW; 73DFFB4B1B780E39 CRC64;

Query Match 8.7%; Score 125; DB 5; Length 172; 

Best Local Similarity 25.9%; Pred. No. 0.0015; 

Matches 38; Conservative 18; Mismatches 49; Indels 42; Gaps 5 

Qy 120 CKDP KINDATQEPVN CTNYTAHVSCFPAPNITCKDSS 156 

I I I I I : I I I I I : I : I : 
Db 41 CKDPVT)HRENATAQQEKKYGCLKFGGSTYEEVEHAMVWCTVF-ADIECY — -: 88 

Qy 157 GNETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTV 216 

I I I : I : : : I : I I : I I I I I I I : I I : 

Db 89 GNRTFLRAG VPCVRYTDHYFVTTLIYSMLLGFLGMDRFCLGQTGTAVGKLLTM 141 



Qy 217 GFCGIGSLIDFILISMQIVGPSDGSSY 243 

I I : : I I I I : : I I I I : : 
Db 142 GGVGVWWIIDVILLITNNLLPEDGSNW 168 



RESULT 14 
Q9BSR6 

ID Q9BSR6 PRELIMINARY; PRT; 149 AA. 

AC Q9BSR6; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE Similar to RIKEN cDNA 2410018G23 gene.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=Pancreas;

RA Strausberg R.;

RL Submitted (MAR-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC004878; AAH04878.1; -.

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1.

SQ SEQUENCE 149 AA; 16490 MW; 749C1C574CB2A518 CRC64;

Query Match 8.4%; Score 121; DB 4; Length 149; 

Best Local Similarity 29.6%; Pred. No. 0.003; 

Matches 42; Conservative 16; Mismatches 60; Indels 24; Gaps 7 

Qy 120 CKDP — KINDAT — QE- PVNCTN YTAH VSCFPAPNITCKDSSGNETHFTGN 165 

I : I I : : I I I I I : II I I : I I 



Db 18 CEDPVDHVGNATASQELGYGCLKFGGQAYSDVEHTSVQCHALDGIEC ASPRTFLREN 74



Qy 166 EVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLI 225 

II I I : : : I I I I I III M : : I I : I II - 

Db 75 KP-------CIKYTGHYFITTLLYSFFLGCFGVDRFCLGHTGTAVGKLLTLGGLGIWWFV 127

Qy 226 DFILISMQIVGPSDGSSYIIDY 247 

III: : I I I I I : : I 
Db 128 DLILLITGGLMPSDGSNWCTVY 149



RESULT 15 
Q8N0X9 

ID Q8N0X9 PRELIMINARY; PRT; 171 AA. 

AC Q8N0X9; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created)

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)

DE Hypothetical protein FLJ90546 (Hypothetical protein FLJ90674).

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Ovarian carcinoma, and Placenta; 

RA Isogai T., Ota T., Nishikawa T., Hayashi K., Otsuki T., Sugiyama T.,

RA Suzuki Y., Nagai K., Sugano S., Ishii S., Kawai-Hio Y., Saito K.,

RA Yamamoto J., Wakamatsu A., Nakamura Y., Kojima S., Nagahari K.,

RA Masuho Y., Ono T., Okano K., Yoshikawa Y., Aotsuka S., Sasaki N.,

RA Hattori A., Okumura K., Iwayanagi T., Ninomiya K.;

RT "NEDO human cDNA sequencing project."; 

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK075027; BAC11359.1; -. 

DR EMBL; AK075155; BAC11437.1; -. 

DR InterPro; IPR007829; TM2.

DR Pfam; PF05154; TM2; 1.

KW Hypothetical protein.

SQ SEQUENCE 171 AA; 19012 MW; CD9F9F7CEE5CB388 CRC64;

Query Match 8.4%; Score 121; DB 4; Length 171; 

Best Local Similarity 29.6%; Pred. No. 0.0035; 

Matches 42; Conservative 16; Mismatches 60; Indels 24; Gaps 7 

Qy 120 CKDP — KINDAT — QE- PVNCTN YTAH VSCFPAPNITCKDSSGNETHFTGN 165

I : I I : : I I I I I : II | I : I I 
Db 40 CEDPVDHVGNATASQELGYGCLKFGGQAYSDVEHTSVQCHALDGIEC ASPRTFLREN 96 

Qy 166 EVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLI 225 

II I I : : : I I I I I I I I I I : : I hi II : 

Db 97 KP-------CIKYTGHYFITTLLYSFFLGCFGVDRFCLGHTGTAVGKLLTLGGLGIWWFV 149

Qy 226 DFILISMQIVGPSDGSSYIIDY 247 

III: : | | | | I : : I 

Db 150 DLILLITGGLMPSDGSNWCTVY 171 

Search completed: March 4, 2004, 10:26:50 
Job time : 83 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 
Run on: March 4, 2004, 09:18:40 



Search time 48 Seconds 

(without alignments) 

291.810 Million cell updates/sec 



Title: US-09-852-100B-2

Perfect score: 1439

Sequence: 1 MHILKGSPNVIPRAHGQKNT ... TRLTRLSITNETFRKTQLYP 269

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 141681 seqs, 52070155 residues

Total number of hits satisfying chosen parameters: 141681

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SwissProt 42:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



                  %
Result          Query
   No.   Score  Match  Length  DB  ID          Description

     1   167.5   11.6     573   1  YKK3_CAEEL  P34280 caenorhabdi
     2      90    6.3    1324   1  VGL2_CVMA5  P11224 murine coro
     3    89.5    6.2     515   1  EF1S_PORPU  P50257 porphyra pu
     4    86.5    6.0     151   1  LCT2_MOUSE  O88803 mus musculu
     5    85.5    5.9     348   1  ADH2_KLULA  P49383 kluyveromyc
     6      85    5.9     338   1  LAMP_RAT    Q62813 rattus norv
     7    83.5    5.8     487   1  Y346_MYCTU  O06297 mycobacteri
     8      83    5.8     348   1  ADH1_KLUMA  Q07288 kluyveromyc
     9      83    5.8     764   1  CFAB_HUMAN  P00751 homo sapien
    10    82.5    5.7     338   1  LAMP_HUMAN  Q13449 homo sapien
    11    81.5    5.7     493   1  GATA_RHIME  Q92qk7 rhizobium m
    12    80.5    5.6     223   1  VG32_BPMD2  O64226 mycobacteri
    13    80.5    5.6     455   1  ENT1_HUMAN  Q99808 homo sapien
    14      80    5.6     328   1  IBP2_HUMAN  P18065 homo sapien
    15      80    5.6     450   1  LIPP_PIG    P00591 sus scrofa
    16      80    5.6     638   1  OAR1_LYMST  O77408 lymnaea sta
    17    79.5    5.5     489   1  ANSP_MYCTU  O33261 mycobacteri
    18    79.5    5.5     521   1  GATA_RHILO  Q98m95 rhizobium l

ALIGNMENTS



RESULT 1 
YKK3_CAEEL 

ID YKK3_CAEEL STANDARD; PRT; 573 AA. 

AC P34280; 

DT 01-FEB-1994 (Rel. 28, Created) 

DT 01-FEB-1994 (Rel. 28, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Hypothetical GTP-binding protein C02F5.3 in chromosome III. 

GN C02F5.3. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2;

RX MEDLINE=94150718; PubMed=7906398;

RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,

RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,

RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fraser A.,

RA Fulton L., Gardner A., Green P., Hawkins T., Hillier L., Jier M.,

RA Johnston L., Jones M., Kershaw J., Kirsten J., Laisster N.,

RA Latreille P., Lightning J., Lloyd C., Mortimore B., O'Callaghan M.,

RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,

RA Sims M., Smaldon N., Smith A., Smith M., Sonnhammer E., Staden R.,

RA Sulston J., Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K.,

RA Waterston R., Watson A., Weinstock L., Wilkinson-Sproat J.,

RA Wohldman P.;

RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C. 

RT elegans."; 

RL Nature 368:32-38(1994). 

CC -!- SIMILARITY: Belongs to the GTP1/OBG family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L14745; AAA27918.1; -. 

DR PIR; S44605; S44605. 

DR WormPep; C02F5.3; CE00039. 

DR InterPro; IPR006074; GTP1/OBG_dom.

DR InterPro; IPR006073; GTP1_OBG.

DR InterPro; IPR006169; GTP1_OBG_sub.

DR InterPro; IPR005225; Small_GTP.

DR InterPro; IPR004095; TGS_dom.

DR InterPro; IPR007829; TM2.

DR Pfam; PF01018; GTP1_OBG; 1.

DR Pfam; PF02824; TGS; 1.

DR Pfam; PF05154; TM2; 1.

DR PRINTS; PR00326; GTP1OBG.

DR TIGRFAMs; TIGR00231; small_GTP; 1.

DR PROSITE; PS00905; GTP1_OBG; 1.

KW Hypothetical protein; GTP-binding.

FT NP_BIND 69 76 GTP (BY SIMILARITY).

FT NP_BIND 115 119 GTP (BY SIMILARITY).

FT NP_BIND 246 249 GTP (BY SIMILARITY).

SQ SEQUENCE 573 AA; 64299 MW; BA437D93C898B9AC CRC64;



Query Match 11.6%; Score 167.5; DB 1; Length 573;

Best Local Similarity 27.9%; Pred. No. 1.7e-07;

Matches 50; Conservative 23; Mismatches 57; Indels 49; Gaps 5;



Qy 90 VTTGPWGAVATSAGGEESLKCEDLKVGQYICKDP K 124

I : I I I I : : I I : : I : I I I

Db 415 VSTNPLGPV VECRFLENS FILCEDPVPLYGPGQTGQQPANES FRNEGKCLK 465

Qy 125 INDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVGFFKPISCRNVNGYSYK 184

: I I I I II III II II I Ih:

Db 466 MGGYRAEDVEFTN VKCRVLPCIEC HGPRT FTKSTPCI I YNGHYFL 510

Qy 185 VAVALSLFLGWLGADRFYLGYPALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSY 243

: |:||| : III III |: : I hi II ::l |: : ::||:| II:
Db 511 TTLLYS I FLGWAVDRFCLGYSAMAVGKLMTLGGFGIWWI VDI FLLVLGVLGPADDSSW 569



RESULT 2 
VGL2_CVMA5 

ID VGL2_CVMA5 STANDARD; PRT; 1324 AA.

AC P11224;

DT 01-JUL-1989 (Rel. 11, Created)

DT 01-FEB-1996 (Rel. 33, Last sequence update)

DT 10-OCT-2003 (Rel. 42, Last annotation update)

DE E2 glycoprotein precursor (Spike glycoprotein) (Peplomer protein)

DE [Contains: Spike protein S1 (90B); Spike protein S2 (90A)].

GN S. 

OS Murine coronavirus (strain A59) (MHV-A59) (Murine hepatitis virus). 

OC Viruses; ssRNA positive-strand viruses, no DNA stage; Nidovirales; 

OC Coronaviridae; Coronavirus. 

OX NCBI_TaxID=11142; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=88072088; PubMed=2825419;

RA Luytjes W., Sturman L.S., Bredenbeek P.J., Charite J.,

RA van der Zeijst B.A.M., Horzinek M.C., Spaan W.J.M.;

RT "Primary structure of the glycoprotein E2 of coronavirus MHV-A59 and

RT identification of the trypsin cleavage site.";

RL Virology 161:479-487(1987). 

CC -!- FUNCTION: THE PEPLOMER PROTEIN MEDIATES THE BINDING OF VIRIONS 

CC TO THE HOST CELL RECEPTOR AND IS INVOLVED IN MEMBRANE FUSION 

CC AND IN SYNCYTIUM FORMATION. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M18379; AAA46455.1; -.

DR InterPro; IPR002552; Corona_S2.

DR Pfam; PF01601; Corona_S2; 1.

KW Glycoprotein; Envelope protein; Transmembrane; Signal.

FT SIGNAL        1     16
FT CHAIN        17   1324  E2 GLYCOPROTEIN.
FT CHAIN        17    717  SPIKE PROTEIN S1.
FT CHAIN       718   1324  SPIKE PROTEIN S2.
FT DOMAIN       17   1265  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM   1266   1286  POTENTIAL.
FT DOMAIN     1287   1324  CYTOPLASMIC (POTENTIAL).
FT DOMAIN     1287   1304  CYS-RICH.
FT CARBOHYD     31     31  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD     60     60  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    192    192  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    357    357  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    435    435  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    530    530  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    625    625  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    657    657  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    665    665  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    688    688  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    737    737  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    754    754  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    893    893  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1180   1180  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1190   1190  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1209   1209  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1225   1225  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD   1246   1246  N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 1324 AA; 146019 MW; E198EF8F0BCDBF0E CRC64;

Query Match 6.3%; Score 90; DB 1; Length 1324;



Best Local Similarity 23.6%; Pred. No. 5; 

Matches 59; Conservative 27; Mismatches 94; Indels 70; Gaps 13; 

Qy 25 TGLYPMRG-PFKNLALLP FSLPLLGGGGSGSGEKVSVSKMAAAWPSGPSA — 73 

1111:11:1111 III I I I : : II I : I 

Db 66 TGYYPVDGSKFRNLALRGTNS VSLSWFQP P YLNQFNDGI FAK — VQNLKT ST P S GATAYF 123 

Qy 74 PEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEP 132

II II : : I | : I : : I I I I I : I 
Db 124 PTIVIGSLFGYTSY-TWIEPYNGVIMAS VCQYTICQLP 161 

Qy 133 WCTNYTAHVSCFPAPNITCKDSSGNETHFTGNE-VGFF KPISCRNVNGYSYKVAV 187 

: I I I I I : : I I : III -I 

Db 162 YTDCKPNTN GNKLIGFWHTDVKPPICVLKRNFTLNVNA 199 

Qy 188 ALSLFLGWLGADRFYLGY PALGLLKFCTVGFCGIGSLIDFILISMQIVGPSDGSSYI 244 

I : Ml I : I II:: : I I : I I : : 

Db 200 DAFYFHFYQHGGTFYAYYADKPSATTFLFSVY IGDILTQYYVLPFICNPTAGSTFA 255 

Qy 245 IDYYGTRLTR 254 

I : I I : 

Db 256 PRYWVTPLVK 265 



RESULT 3 
EF1S_PORPU

ID EF1S_PORPU STANDARD; PRT; 515 AA.

AC P50257; 

DT 01-OCT-1996 (Rel. 34, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 15-DEC-1998 (Rel. 37, Last annotation update) 

DE Elongation factor 1-alpha S (EF-1-alpha S) (Sporophyte-specific EF-1-

DE alpha).

GN TEF-S. 

OS Porphyra purpurea. 

OC Eukaryota; Rhodophyta; Bangiophyceae; Bangiales; Bangiaceae; Porphyra. 

OX NCBI_TaxID=2787; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Avonport; 

RX MEDLINE=96309386; PubMed=8704161;

RA Liu Q.Y., Baldauf S.L., Reith M.E.; 

RT "Elongation factor 1 alpha genes of the red alga Porphyra purpurea 

RT include a novel, developmentally specialized variant."; 

RL Plant Mol. Biol. 31:77-85(1996). 

CC -!- FUNCTION: This protein promotes the GTP-dependent binding of 
CC aminoacyl-tRNA to the A-site of ribosomes during protein 

CC biosynthesis. 

CC -!- SUBCELLULAR LOCATION: Cytoplasmic. 



CC -!- DEVELOPMENTAL STAGE: EXPRESSED ONLY IN THE SPOROPHYTE, A SHELL-

CC BORING, FILAMENTOUS PHASE.

CC -!- SIMILARITY: Belongs to the GTP-binding elongation factor family.

CC EF-Tu/EF-1A subfamily.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; U08841; AAA61790.1; -. 

DR HSSP; P07157; 1AIP. 

DR InterPro; IPR004539; EF1_alpha.

DR InterPro; IPR000795; EF_GTPbind.

DR InterPro; IPR004160; EFTU_Cterm.

DR InterPro; IPR004161; EFTU_D2.

DR InterPro; IPR009001; Elong_init_C.

DR InterPro; IPR009000; Translat_factor.

DR Pfam; PF00009; GTP_EFTU; 1.

DR Pfam; PF03144; GTP_EFTU_D2; 1.

DR Pfam; PF03143; GTP_EFTU_D3; 1.

DR PRINTS; PR00315; ELONGATNFCT.

DR TIGRFAMs; TIGR00483; EF-1_alpha; 1.

DR PROSITE; PS00301; EFACTOR_GTP; 1.

KW Elongation factor; Protein biosynthesis; GTP-binding;

KW Multigene family. 

FT NP_BIND 14 21 GTP (BY SIMILARITY).

FT NP_BIND 91 95 GTP (BY SIMILARITY).

FT NP_BIND 151 154 GTP (BY SIMILARITY).

SQ SEQUENCE 515 AA; 56648 MW; EBA03F4029F62350 CRC64;

Query Match 6.2%; Score 89.5; DB 1; Length 515; 
Best Local Similarity 24.7%; Pred. No. 1.8; 

Matches 37; Conservative 20; Mismatches 56; Indels 37; Gaps 6; 

Qy 41 PFSLPL LGGGGSGSGEKVSVSKMAAAW P S G P S AP EAVT ARL VGVLW FVS V 90 

I I I I : I I I : : I : I | : | : | | : : 

Db 261 PLRLPLQDVYKIGGIGTVPVGRVETGILKAGMQVTFEPAGKAAVEVKSVEM HH 313 

Qy 91 TTGPWGAVAT SAGGEES LKCEDLKVGQYI CKDPK INDATQEPVNCTN — 137 

I : I : I | :|:| | :| I I I |: I I 

Db 314 TSVPQAI PGDNVGFNVKLTVKDIKRGD-VCGDTKNDPPI PTECFLANVI IQDHICNIRNGY 372 

Qy 138 YT AH VS C F P APN I T C KD S S GN ET H 161 

:|||::| I :: II I :|| 

Db 373 TPVLDCHTAHIACKFASILSKKDKRGKQTH 402



RESULT 4 
LCT2_MOUSE 

ID LCT2_MOUSE STANDARD; PRT; 151 AA. 

AC O88803; O88804; Q9QWN3; Q9Z337;

DT 15-JUL-1999 (Rel. 38, Created) 

DT 15-JUL-1999 (Rel. 38, Last sequence update) 



DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Leukocyte cell-derived chemotaxin 2 precursor (Chondromodulin II) 

DE (ChM-II).

GN LECT2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2), AND VARIANT VAL-129. 

RC STRAIN=BALB/c; TISSUE=Liver;

RX MEDLINE=98382586; PubMed=9714793;

RA Yamagoe S., Watanabe T., Mizuno S., Suzuki K.;

RT "The mouse Lect2 gene: cloning of cDNA and genomic DNA, structural 

RT characterization and chromosomal localization."; 

RL Gene 216:171-178(1998). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM 1).

RC STRAIN=Swiss Webster / NIH; TISSUE=Embryo, and Liver;

RX MEDLINE=99160594; PubMed=10050029;

RA Shukunami C., Kondo J., Wakai H., Takahashi K., Inoue H., Kamizono A.,

RA Hiraki Y.;

RT "Molecular cloning of mouse and bovine chondromodulin-II cDNAs and the

RT growth-promoting actions of bovine recombinant protein."; 

RL J. Biochem. 125:436-442(1999). 

CC -!- FUNCTION: Has a neutrophil chemotactic activity. Also a positive 
CC regulator of chondrocyte proliferation. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- ALTERNATIVE PRODUCTS:

CC Event=Alternative splicing; Named isoforms=2;

CC Name=1; Synonyms=LECT2;

CC IsoId=O88803-1; Sequence=Displayed;

CC Name=2; Synonyms=LECT2Q;

CC IsoId=O88803-2; Sequence=VSP_003051;

CC -!- TISSUE SPECIFICITY: Highly expressed in liver and weakly in 
CC testis. Not expressed in heart, brain, spleen, lung, skeletal 

CC muscle and kidney. 

CC -!- SIMILARITY: Belongs to the LECT2 / MIM-1 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC 

DR EMBL; AB009687; BAA33383.1; -. 

DR EMBL; AB009688; BAA33384.1; -. 

DR EMBL; AB009689; BAA33385.1; -.

DR EMBL; AB009689; BAA33386.1; -.

DR EMBL; AF035161; AAF13302.1; -.

DR MGD; MGI:1278342; Lect2.

DR InterPro; IPR008663; LECT2.

DR Pfam; PF05429; LECT2; 1. 

KW Chemotaxis; Signal; Alternative splicing. 

FT SIGNAL 1 18 BY SIMILARITY. 



FT CHAIN 19 151 LEUKOCYTE CELL-DERIVED CHEMOTAXIN 2. 

FT VARSPLIC 98 151 FCVKIFYIKPIKYKGSIKKGEKLGTLLPLQKIYPGIQSHVH 

FT VENCDSSDPTAYL -> QRLQAHTTTLNVFTCYWDKIQI PR 

FT PTRFLCQNFLH (in isoform 2). 

FT /FTId=VSP_003051.

FT VARIANT 129 129 I -> V. 

SQ SEQUENCE 151 AA; 16405 MW; 18AF444046B7AE8E CRC64; 

Query Match 6.0%; Score 86.5; DB 1; Length 151;

Best Local Similarity 24.8%; Pred. No. 0.75; 

Matches 29; Conservative 9; Mismatches 52; Indels 27; Gaps 4; 

Qy 78 TARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEPVNCTN 137 

II: I I I I : I I I : III : II 
Db 4 TTILISAALLSSALAGPWANICASKSSNEIRTCDSYGCGQYSAQ RTQR 51 

Qy 138 YT AH VS C F P APN I T C KD S S GN ET H FT GN E VG FFK P ISC RNV NGYSYKV 185 

I | :: | | I I I I I I I I : I I : I : 

Db 52 H HPGVDVLCSDGSWYAPFTGKIVGQEKPYRNKNAINDGIRLSGRGFCVKI 102 



RESULT 5 
ADH2_KLULA



ID ADH2_KLULA STANDARD; PRT; 348 AA. 

AC P49383; 

DT 01-FEB-1996 (Rel. 33, Created) 

DT 01-FEB-1996 (Rel. 33, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Alcohol dehydrogenase II (EC 1.1.1.1). 

GN ADH2.

OS Kluyveromyces lactis (Yeast).

OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;

OC Saccharomycetales; Saccharomycetaceae; Kluyveromyces.

OX NCBI_TaxID=28985;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=CBS 2359 / IFO 1267 / NRRL Y-1140 / WM37; 

RX MEDLINE=92269769; PubMed=1588917;

RA Shain D.H., Salvadore C., Denis C.L.;

RT "Evolution of the alcohol dehydrogenase (ADH) genes in yeast: 

RT characterization of a fourth ADH in Kluyveromyces lactis."; 

RL Mol. Gen. Genet. 232:479-488(1992). 

CC -!- CATALYTIC ACTIVITY: An alcohol + NAD(+) = an aldehyde or ketone +
CC NADH.

CC -!- COFACTOR: Binds 2 zinc ions per subunit (By similarity). 

CC -!- SUBUNIT: Homotetramer (By similarity). 

CC -!- SUBCELLULAR LOCATION: Cytoplasmic (By similarity). 

CC -!- SIMILARITY: Belongs to the zinc-containing alcohol dehydrogenase 

CC family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC

DR EMBL; X64397; CAA45739.1; -.
DR PIR; S20911; S20911.
DR InterPro; IPR002328; ADH_zinc.
DR InterPro; IPR002085; Adh_zn_family.
DR Pfam; PF00107; ADH_zinc_N; 1.
DR PROSITE; PS00059; ADH_ZINC; 1.
KW Oxidoreductase; Zinc; Metal-binding; NAD; Multigene family.
FT METAL 44 44 ZINC 1 (CATALYTIC) (BY SIMILARITY).
FT METAL 67 67 ZINC 1 (CATALYTIC) (BY SIMILARITY).
FT METAL 98 98 ZINC 2 (BY SIMILARITY).
FT METAL 101 101 ZINC 2 (BY SIMILARITY).
FT METAL 104 104 ZINC 2 (BY SIMILARITY).
FT METAL 112 112 ZINC 2 (BY SIMILARITY).
FT METAL 154 154 ZINC 1 (CATALYTIC) (BY SIMILARITY).

SQ SEQUENCE 348 AA; 37097 MW; F3B64AE1F520689C CRC64;



Query Match 5.9%; Score 85.5; DB 1; Length 348; 

Best Local Similarity 19.9%; Pred. No. 2.5; 

Matches 65; Conservative 35; Mismatches 111; Indels 115; Gaps 13; 

Qy 19 NTRRDG TGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSVSKMAAAW 67 

I : I ||: : I : II I I I : I I I : I I :: : I 

Db 37 NVKYSGVCHTDLHAWKGDWP LPTKLPLV-GGHEGAGVWAMGENVKGWI I GDFAGI 91 



Qy 68 PSGPSAPEAVT 78 

I I I 

Db 92 KWLNGSCMSCEYCELSNESNCPDADLSGYTHDGSFQQYATADAVQAARIPKGTDLAEVAP 151 

Qy 79 ARLVGV LW FVS VT T G P WGAVAT S AGGE E S L KCE D L KV 115

II I : | I I : : : I I II : I 

Db 152 I LC AGVTVY KAL K S ADL KAGDWVAI S GAC GGL G S LAI Q YAKAMG YRVLG IDT GAEKAKL F 211 

Qy 116 GQYICKDPKINDATQEPVNCTNYTAH VSCFPAPNITCKDSSGNETHFTGN 165 

I : I I : I I : I I I I III : I I I 

Db 212 KELGGEYFVDYAVSKDLIKEIVDATNGGAHGVINVSVSEFAI EQSTNYVRSNGT 265 

Qy 166 EVGFFKPISCRNVNGYSYKVAVALSLFLGWLG — AD-RFYLGYPALGLLKFCTVGFCGIG 222 

I I : : : I : : I : :: I I I I I : I I I : : I : 

Db 266 WLVGLPRDAKCKSDVFTQWKSVSIVGSYVGNRADTREALDFFARGLV-HAPIKIVGLS 324 



Qy 223 SLIDFI— LISMQIVGPSDGSSYIID 246 

II : : : I I I I : : I 

Db 325 ELADVYDKMVKGEIVG RYWD 345



RESULT 6 
LAMP_RAT

ID LAMP_RAT STANDARD; PRT; 338 AA. 

AC Q62813; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Limbic system-associated membrane protein precursor (LSAMP).

GN LSAMP OR LAMP.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 29-49. 

RC TISSUE=Hippocampus;

RX MEDLINE=95374785; PubMed=7646886;

RA Pimenta A.F., Zhukareva V., Barbe M.F., Reinoso B.S., Grimley C.,

RA Henzel W., Fischer I., Levitt P.;

RT "The limbic system-associated membrane protein is an Ig superfamily 

RT member that mediates selective neuronal growth and axon targeting."; 

RL Neuron 15:287-297(1995). 

CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING.
CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF

CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH

CC OF THE HIPPOCAMPAL MOSSY FIBER PROJECTION.

CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor.

CC -!- TISSUE SPECIFICITY: Expressed mostly by neurons comprising limbic-
CC associated cortical and subcortical regions that function in

CC cognition, emotion, memory, and learning.

CC -!- DEVELOPMENTAL STAGE: FIRST DETECTED AT E15-16, AT STAGE E20 IT IS

CC DETECTED IN PRESUMPTIVE CORTEX, MEDIAL LIMBIC AREAS OF THE

CC THALAMUS AND HYPOTHALAMUS. IN THE ADULT, IT IS FOUND IN

CC HYPOTHALAMUS, PERIRHINAL CORTEX, AMYGDALA AND MEDIAL THALAMIC

CC REGION.

CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON
CC family.

CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 



CC 














DR EMBL; U31554; AAA86120.1; -.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 3.
DR SMART; SM00408; IGc2; 2.
DR PROSITE; PS50835; IG_LIKE; 3.
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor;
KW Repeat; Signal; Lipoprotein.
FT SIGNAL 1 28
FT CHAIN 29 315 LIMBIC SYSTEM-ASSOCIATED MEMBRANE
FT PROTEIN.
FT PROPEP 316 338 REMOVED IN MATURE FORM (POTENTIAL).
FT DOMAIN 29 122 IG-LIKE C2-TYPE 1.
FT DOMAIN 132 214 IG-LIKE C2-TYPE 2.
FT DOMAIN 219 304 IG-LIKE C2-TYPE 3.
FT DISULFID 53 111 POTENTIAL.
FT DISULFID 153 197 POTENTIAL.
FT DISULFID 239 290 POTENTIAL.
FT CARBOHYD 40 40 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 66 66 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 148 148 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD [position illegible] N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 300 300 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 315 315 N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID 315 315 GPI-anchor amidated asparagine
FT (Potential).

SQ SEQUENCE 338 AA; 37324 MW; 0B76AFDD68A39BB6 CRC64;

Query Match 5.9%; Score 85; DB 1; Length 338;



Best Local Similarity 27.7%; Pred. No. 2.7; 
Matches 36; Conservative 15; Mismatches 47; Indels 32; Gaps 7;

Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : Mill I :: I Ml : I I |:| :: 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQSSLTVTNVT-EEHY 285 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209

I I I MM I M I I M I I M I I II I 

Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LA 328 

Qy 210 LLKFCTVGFC 219 

II: I 

Db 329 ASLFCLLSKC 338 



RESULT 7 
Y346_MYCTU

ID Y346_MYCTU STANDARD; PRT; 487 AA.

AC O06297;

DT 15-JUL-1998 (Rel. 36, Created) 

DT 15-JUL-1998 (Rel. 36, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Hypothetical transport protein Rv0346c/MT0361/Mb0354c.

GN RV0346C OR MT0361 OR MTCY13E10.06C OR MB0354C.

OS Mycobacterium tuberculosis, and

OS Mycobacterium bovis.

OC Bacteria; Actinobacteria; Actinobacteridae; Actinomycetales;

OC Corynebacterineae; Mycobacteriaceae; Mycobacterium.

OX NCBI_TaxID=1773, 1765; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC SPECIES=M.tuberculosis; STRAIN=H37Rv;

RX MEDLINE=98295987; PubMed=9634230;

RA Cole S.T., Brosch R., Parkhill J., Garnier T., Churcher C., Harris D.,

RA Gordon S.V., Eiglmeier K., Gas S., Barry C.E. III, Tekaia F.,

RA Badcock K., Basham D., Brown D., Chillingworth T., Connor R.,

RA Davies R., Devlin K., Feltwell T., Gentles S., Hamlin N., Holroyd S.,

RA Hornsby T., Jagels K., Krogh A., McLean J., Moule S., Murphy L.,

RA Oliver S., Osborne J., Quail M.A., Rajandream M.A., Rogers J.,

RA Rutter S., Seeger K., Skelton S., Squares S., Squares R.,

RA Sulston J.E., Taylor K., Whitehead S., Barrell B.G.; 

RT "Deciphering the biology of Mycobacterium tuberculosis from the 

RT complete genome sequence."; 

RL Nature 393:537-544(1998). 

RN [2] 

RP SEQUENCE FROM N.A. 



RC SPECIES=M.tuberculosis; STRAIN=CDC 1551 / Oshkosh;

RX MEDLINE=22206494; PubMed=12218036;

RA Fleischmann R.D., Alland D., Eisen J.A., Carpenter L., White O.,

RA Peterson J., DeBoy R., Dodson R., Gwinn M., Haft D., Hickey E.,

RA Kolonay J.F., Nelson W.C., Umayam L.A., Ermolaeva M., Salzberg S.L.,

RA Delcher A., Utterback T., Weidman J., Khouri H., Gill J., Mikula A.,

RA Bishai W., Jacobs W.R. Jr., Venter J.C., Fraser C.M.;

RT "Whole-genome comparison of Mycobacterium tuberculosis clinical and

RT laboratory strains.";

RL J. Bacteriol. 184:5479-5490(2002).

RN [3] 

RP SEQUENCE FROM N.A. 

RC SPECIES=M.bovis; STRAIN=AF2122/97;

RX MEDLINE=22709107; PubMed=12788972;

RA Garnier T., Eiglmeier K., Camus J.-C., Medina N., Mansoor H.,

RA Pryor M., Duthoy S., Grondin S., Lacroix C., Monsempe C., Simon S.,

RA Harris B., Atkin R., Doggett J., Mayes R., Keating L., Wheeler P.R.,

RA Parkhill J., Barrell B.G., Cole S.T., Gordon S.V., Hewinson R.G.;

RT "The complete genome sequence of Mycobacterium bovis.";

RL Proc. Natl. Acad. Sci. U.S.A. 100:7877-7882(2003). 

CC -!- FUNCTION: Probable amino-acid or metabolite transport protein. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein (Potential). 

CC -!- SIMILARITY: Belongs to the amino acid permease family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Z95324; CAB08578.1; -. 

DR EMBL; AE006942; AAK44583.1; -. 

DR EMBL; BX248335; CAD93217.1; -.

DR PIR; C70574; C70574. 

DR TIGR; MT0361; -. 

DR TubercuList; Rv0346c; -. 

DR InterPro; IPR002293; AA/rel_permease1.

DR InterPro; IPR004840; AAc_permease.

DR InterPro; IPR004841; Permease_region.

DR Pfam; PF00324; aa_permeases; 1.

DR PROSITE; PS00218; AMINO_ACID_PERMEASE_1; 1.

KW Hypothetical protein; Transport; Amino-acid transport; Transmembrane; 



KW Complete proteome.
FT TRANSMEM 26 46 POTENTIAL.
FT TRANSMEM 50 70 POTENTIAL.
FT TRANSMEM 98 118 POTENTIAL.
FT TRANSMEM 133 153 POTENTIAL.
FT TRANSMEM 163 183 POTENTIAL.
FT TRANSMEM 214 234 POTENTIAL.
FT TRANSMEM 256 276 POTENTIAL.
FT TRANSMEM 290 310 POTENTIAL.
FT TRANSMEM 341 361 POTENTIAL.
FT TRANSMEM 369 389 POTENTIAL.
FT TRANSMEM 414 434 POTENTIAL.
FT TRANSMEM 440 460 POTENTIAL.



SQ SEQUENCE 487 AA; 52194 MW; 3572502DB6ACD987 CRC64; 



Query Match 5.8%; Score 83.5; DB 1; Length 487; 

Best Local Similarity 26.5%; Pred. No. 5.8; 

Matches 27; Conservative 16; Mismatches 40; Indels 19; Gaps 5; 

Qy 159 ETHFTGNEVGFFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTVGF 218 

: I : I : I : I : : I : I I I I I II I I 

Db 8 DERLTREDTGYHKGLHSRQLQMIALGGAI GTGLFLG — AGGRLAS AG P GL FLVYGI 61 

Qy 219 CGIGSLIDFILISMQIVG PSDGS — SYIIDYYGTRL 252 

III | : : : : : | II II II : : I I : : 

Db 62 CGI FVFLILRALGELVLHRPSSGSFVSYAREFYGEKV 98 



RESULT 8 
ADH1_KLUMA



ID ADH1_KLUMA STANDARD; PRT; 348 AA. 

AC Q07288; 

DT 01-FEB-1995 (Rel. 31, Created) 

DT 01-FEB-1995 (Rel. 31, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Alcohol dehydrogenase 1 (EC 1.1.1.1). 

GN ADH1. 

OS Kluyveromyces marxianus (Yeast) (Kluyveromyces fragilis).

OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;

OC Saccharomycetales; Saccharomycetaceae; Kluyveromyces. 

OX NCBI_TaxID=4911; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ATCC 12424; 

RX MEDLINE=93250057; PubMed=8485163;

RA Ladriere J.M., Delcour J., Vandenhaute J.;

RT "Sequence of a gene coding for a cytoplasmic alcohol dehydrogenase

RT from Kluyveromyces marxianus ATCC 12424.";

RL Biochim. Biophys. Acta 1173:99-101(1993).

CC -!- CATALYTIC ACTIVITY: An alcohol + NAD(+) = an aldehyde or ketone +
CC NADH.

CC -!- COFACTOR: Binds 2 zinc ions per subunit (By similarity).

CC -!- SUBUNIT: Homotetramer.

CC -!- SUBCELLULAR LOCATION: Cytoplasmic.

CC -!- SIMILARITY: Belongs to the zinc-containing alcohol dehydrogenase 
CC family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X60224; CAA42785.1; -. 

DR PIR; S32521; S32521. 

DR InterPro; IPR002328; ADH_zinc. 

DR InterPro; IPR002085; Adh_zn_family.

DR Pfam; PF00107; ADH_zinc_N; 1. 



DR PROSITE; PS00059; ADH_ZINC; 1.
KW Oxidoreductase; Zinc; Metal-binding; NAD; Multigene family.
FT METAL 44 44 ZINC 1 (CATALYTIC) (BY SIMILARITY).
FT METAL 67 67 ZINC 1 (CATALYTIC) (BY SIMILARITY).
FT METAL 98 98 ZINC 2 (BY SIMILARITY).
FT METAL 101 101 ZINC 2 (BY SIMILARITY).
FT METAL 104 104 ZINC 2 (BY SIMILARITY).
FT METAL 112 112 ZINC 2 (BY SIMILARITY).
FT METAL 154 154 ZINC 1 (CATALYTIC) (BY SIMILARITY).

SQ SEQUENCE 348 AA; 37158 MW; A75D2EBE82E355BD CRC64;

Query Match 5.8%; Score 83; DB 1; Length 348;



Best Local Similarity 20.8%; Pred. No. 4.3; 

Matches 65; Conservative 38; Mismatches 107; Indels 102; Gaps 14; 

Qy 19 NTRRDG TGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSVSKMAAAWPSG 70 

I : I ||: : I : I 111= II I : I I : : : I I 

Db 37 NVKYSGVCHTDLHAWQGDWP LDTKLPLV-GGHEGAGIWAMGENVTGWEIGDYAGI 91 

Qy 71 PSAPEA VTARLV 82 

I : I : I III 
Db 92 KWLNGSCMSCEECELSNEPNCPKADLSGYTHDGSFQQYATADAVQAARIPKNVDLAEVAP 151 

Qy 83 GV LWFVSVTTGPWGAVATSAGGEESLKCEDLKV 115 

II I : | | | : : : I I I I : I 

Db 152 ILCAGVTWKALKSAHIKAGDWVAISGACGGLGSLA^ 211 

Qy 116 GQYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNE — VGF 169 

I : I II I : I I I I : : : I I I II 

Db 212 KELGGEYFIDFTKTKDMVAEVI1^TNGVAHAVI^^V P SVSE_AAISTSVLYTRSNGTVAALVGL 271 

Qy 170 FKPISCRNVNGYSYKVAVALSLFLGWLG — AD-RFYLGYPALGLLK — FCTVGFCGIGSL 224 

: I : : : I : : I : : : I III I : : I I : I : I : I : 

Db 272 PRDAQCK--SDVFNQWKSISIVGSWGNRADTREALDFFSRGLVKAPIKILGLSE1^^ 329 

Qy 225 IDFILISMQIVG 236 

I :: I I I I 
Db 330 YD- KMVKGQI VG 340 



RESULT 9 
CFAB_HUMAN 

ID CFAB_HUMAN STANDARD; PRT; 764 AA. 

AC P00751; O15006; Q29944; Q96HX6; Q9BTF5; Q9BX92;

DT 21-JUL-1986 (Rel. 01, Created) 

DT 01-OCT-1994 (Rel. 30, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Complement factor B precursor (EC 3.4.21.47) (C3/C5 convertase) 

DE (Properdin factor B) (Glycine-rich beta glycoprotein) (GBG) (PBF2).

GN BF.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANTS ARG-28; GLN-28; GLN-32 

RP AND SER-736. 



RX MEDLINE=91065702; PubMed=2249879;

RA Davrinche C., Abbal M., Clerc A.;

RT "Molecular characterization of human complement factor B subtypes."; 

RL Immunogenetics 32:309-312(1990). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANTS ARG-28 AND GLN-32.

RC TISSUE=Liver;

RX MEDLINE=94237735; PubMed=8181962;

RA Mejia J.E., Jahn I., de la Salle H., Hauptmann G.;

RT "Human factor B. Complete cDNA sequence of the BF*S allele."; 

RL Hum. Immunol. 39:49-53(1994). 

RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANTS ARG-28 AND GLN-32. 

RC TISSUE=Liver; 

RX MEDLINE=94041399; PubMed=8225386; 

RA Schwaeble W., Luettig B., Sokolowski T., Estaller C., Weiss E.H.,

RA Meyer Zum Bueschenfelde K.-H., Whaley K., Dippold W.;

RT "Human complement factor B: functional properties of a recombinant

RT zymogen of the alternative activation pathway convertase.";

RL Immunobiology 188:221-232(1993). 

RN [4] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANTS ARG-28 AND GLN-32. 

RX MEDLINE=94067177; PubMed=8247029;

RA Horiuchi T., Kim S., Matsumoto M., Watanabe I., Fujita S.,

RA Volanakis J.E.; 

RT "Human complement factor B: cDNA cloning, nucleotide sequencing, 

RT phenotypic conversion by site-directed mutagenesis and expression."; 

RL Mol. Immunol. 30:1587-1592(1993). 

RN [5] 

RP SEQUENCE FROM N.A. 

RA Rowen L., Dankers C., Baskin D., Faust J., Loretz C., Ahearn M.E.,

RA Banta A., Swartzell S., Smith T.M., Spies T., Hood L.;

RT "Sequence determination of 300 kilobases of the human class III MHC

RT locus.";

RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases.

RN [6] 

RP SEQUENCE FROM N.A. (ISOFORM 2). 

RA Jaatinen T., Kanerva J., Poutanen K.E., Saarinen-Pihkala U., 

RA Lokki M.-L.;

RT "Expression and alternative splicing of human factor B gene in

RT leukemic mononuclear cells.";

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.

RN [7] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANTS HIS-9; GLN-32; TRP-32; 

RP SER-252; GLU-565 AND GLU-651. 

RA Rieder M.J., Carrington D.P., Hastings N.C., Ahearn M.O., 

RA Kuldanek S.A., Rajkumar N., Toth E.J., Yi Q., Nickerson D.A.;

RL Submitted (NOV-2002) to the EMBL/GenBank/DDBJ databases.

RN [8] 

RP SEQUENCE FROM N.A. (ISOFORM 1), AND VARIANT TRP-32. 

RC TISSUE=Colon; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [9] 

RP SEQUENCE OF 26-764, PARTIAL SEQUENCE FROM N.A., AND CARBOHYDRATES.

RX MEDLINE=84161997; PubMed=6546754;

RA Mole J.E., Anderson J.K., Davison E.A., Woods D.E.; 

RT "Complete primary structure for the zymogen of human complement 

RT factor B."; 

RL J. Biol. Chem. 259:3407-3412(1984). 

RN [10] 

RP SEQUENCE OF 260-764. 

RX MEDLINE=83204002; PubMed=6342610;

RA Christie D.L., Gagnon J.; 

RT "Amino acid sequence of the Bb fragment from complement Factor B. 

RT Sequence of the major cyanogen bromide-cleavage peptide (CB-II) and 

RT completion of the sequence of the Bb fragment."; 

RL Biochem. J. 209:61-70(1983). 

RN [11] 

RP SEQUENCE OF 339-764 FROM N.A. 

RX MEDLINE=83273641; PubMed=6308626; 

RA Campbell R.D., Porter R.R.; 

RT "Molecular cloning and characterization of the gene coding for human 

RT complement protein factor B."; 

RL Proc. Natl. Acad. Sci. U.S.A. 80:4464-4468(1983). 

RN [12] 

RP SEQUENCE OF 467-595 AND 752-764 FROM N.A. 

RX MEDLINE=83039428; PubMed=6957884;

RA Woods D.E., Markham A.F., Ricker A.T., Goldberger G., Colten H.R.;

RT "Isolation of cDNA clones for the human complement protein factor B, 

RT a class III major histocompatibility complex gene product."; 

RL Proc. Natl. Acad. Sci. U.S.A. 79:5661-5665(1982). 

RN [13] 

RP SEQUENCE OF 16-259 FROM N.A. 

RX MEDLINE=84158524; PubMed=6323161;

RA Morley B.J., Campbell R.D.; 

RT "Internal homologies of the Ba fragment from human complement 

RT component Factor B, a class III MHC antigen."; 

RL EMBO J. 3:153-157(1984). 

RN [14] 

RP SEQUENCE OF 1-99 FROM N.A. 

RC TISSUE=Blood; 

RX MEDLINE=87102880; PubMed=3643061; 

RA Wu L.C., Morley B.J., Campbell R.D.; 

RT "Cell-specific expression of the human complement protein factor B 



RT gene: evidence for the role of two distinct 5'-flanking elements.";

RL Cell 48:331-342(1987). 

RN [15] 

RP GLYCATION OF LYS-291. 

RX MEDLINE=91174758; PubMed=2006911;

RA Niemann M.A., Bhown A.S., Miller E.J.;

RT "The principal site of glycation of human complement factor B."; 

RL Biochem. J. 274:473-480(1991). 

CC -!- FUNCTION: Factor B which is part of the alternate pathway of the 

CC complement system is cleaved by factor D into 2 fragments: Ba and 

CC Bb. Bb, a serine protease, then combines with complement factor 3b 

CC to generate the C3 or C5 convertase. It has also been implicated 

CC in proliferation and differentiation of preactivated B 

CC lymphocytes, rapid spreading of peripheral blood monocytes, 

CC stimulation of lymphocyte blastogenesis and lysis of erythrocytes. 

CC Ba inhibits the proliferation of preactivated B lymphocytes. 

CC -!- CATALYTIC ACTIVITY: Cleaves C3 in the alpha-chain to yield C3a and 

CC C3b. Cleaves C5 in the alpha-chain to yield C5a and C5b. Both 

CC cleavages take place at the C-terminal of an arginine residue. 

CC -!- SUBUNIT: Monomer. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=2; 

CC Name=1;

CC IsoId=P00751-1; Sequence=Displayed;

CC Name=2;

CC IsoId=P00751-2; Sequence=VSP_005380, VSP_005381;

CC -!- POLYMORPHISM: Two major variants, F and S, and 2 minor variants,

CC as well as at least 14 very rare variants, have been identified.

CC -!- SIMILARITY: Belongs to peptidase family S1.

CC -!- SIMILARITY: Contains 3 Sushi (SCR) domains.

CC -!- SIMILARITY: Contains 1 VWFA domain.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X72875; CAA51389.1; -. 

DR EMBL; S67310; AAD13989.1; 

DR EMBL; L15702; AAA16820.1; -. 

DR EMBL; X00284; CAA25077.1; 

DR EMBL; AF019413; AAB67977.1; -. 

DR EMBL; AF349679; AAK30167.1; -. 

DR EMBL; AF551848; AAN71991.1; -. 

DR EMBL; BC004143; AAH04143.1; -. 

DR EMBL; BC007990; AAH07990.1; 

DR EMBL; K01566; AAA36225.2; -. 

DR EMBL; J00125; -; NOT_ANNOTATED_CDS.

DR EMBL; J00126; AAA36226.1; -. 

DR EMBL; J00185; AAA36219.1; ALT_SEQ. 

DR EMBL; J00186; AAA36220.1; -. 

DR EMBL; M15082; AAA59625.1; -. 

DR PIR; S34075; BBHU. 



DR HSSP; P20231; 1AA0.
DR MEROPS; S01.196;
DR SWISS-2DPAGE; P00751; HUMAN.
DR Siena-2DPAGE; P00751;
DR Genew; HGNC:1037; BF.
DR MIM; 138470;
DR GO; GO:0003811; F:complement activity; TAS.
DR InterPro; IPR009003; Cys_Ser_trypsin.
DR InterPro; IPR001254; Peptidase_S1.
DR InterPro; IPR001314; Peptidase_S1A.
DR InterPro; IPR000436; Sushi_SCR_CCP.
DR InterPro; IPR002035; VWF_A.
DR Pfam; PF00084; sushi; 3.
DR Pfam; PF00089; trypsin; 1.



Query Match 5.8%; Score 83; DB 1; Length 764;

Best Local Similarity 24.1%; Pred. No. 11;

Matches 49; Conservative 21; Mismatches 71; Indels 62; Gaps 12;

Qy 24 GTGLYPMRGPFKNLALLPFSLPLLGGG GSGSGEKVSV 60

I : I I IhllIMM Mill:

Db 2 GSNLSP QLCLMPFILGLLSGGVTTTPWSLARPQGSCSLEGVEIKGGSFRLLQEG 55

Qy 61 SKMAAAWPSG--PSAPEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKC-- 110

: I I I I : I I : | | | : | : : I

Db 56 QALEYVCPSGFYPYPVQTRTCR STGSWSTLKTQDQKTVRKAECRAIHCPR 105

Qy 111 -EDLKVGQYICKDPKINDATQEPVNC-TNYTAHVSCFPAPNITCKDSS--GNETHFTGNE 166

I : I : I : I I : : : I II I III:: : I I

Db 106 PHDFENGEYWPRSPYYNVSDEISFHCYDGYTLRGSA NRTCQVNGRWSGQTAICDNG 161

Qy 167 VGFFK PISCRNVNGYSYKV 185

I : II III I : :

Db 162 AGYCSNPGIPIGTRKV-GSQYRL 183



RESULT 10
LAMP_HUMAN

ID LAMP_HUMAN STANDARD; PRT; 338 AA.
AC Q13449;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 10-OCT-2003 (Rel. 42, Last annotation update)
DE Limbic system-associated membrane protein precursor (LSAMP).
GN LSAMP OR LAMP.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=96235133; PubMed=8666243;
RA Pimenta A.F., Fischer I., Levitt P.;
RT "cDNA cloning and structural analysis of the human limbic-system-
RT associated membrane protein (LAMP).";
RL Gene 170:189-195(1996).
CC -!- FUNCTION: MEDIATES SELECTIVE NEURONAL GROWTH AND AXON TARGETING.
CC CONTRIBUTES TO THE GUIDANCE OF DEVELOPING AXONS AND REMODELING OF
CC MATURE CIRCUITS IN THE LIMBIC SYSTEM. ESSENTIAL FOR NORMAL GROWTH
CC OF THE HIPPOCAMPAL MOSSY FIBER PROJECTION (BY SIMILARITY).
CC -!- SUBCELLULAR LOCATION: Attached to the membrane by a GPI-anchor.
CC -!- TISSUE SPECIFICITY: Expressed on limbic neurons and fiber tracts
CC as well as in single layers of the superior colliculus, spinal
CC cord and cerebellum.
CC -!- SIMILARITY: Belongs to the immunoglobulin superfamily. IgLON
CC family.
CC -!- SIMILARITY: Contains 3 immunoglobulin-like C2-type domains.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U41901; AAC50569.1; -.
DR PIR; JC4776; JC4776.
DR Genew; HGNC:6705; LSAMP.
DR MIM; 603241; -.
DR GO; GO:0007399; P:neurogenesis; TAS.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR Pfam; PF00047; ig; 3.
DR SMART; SM00408; IGc2; 2.
DR PROSITE; PS50835; IG_LIKE; 3.
KW Immunoglobulin domain; Cell adhesion; Glycoprotein; GPI-anchor;
KW Repeat; Signal; Lipoprotein.
FT SIGNAL        1     28  POTENTIAL.
FT CHAIN        29    315  LIMBIC SYSTEM-ASSOCIATED MEMBRANE
FT                         PROTEIN.
FT PROPEP      316    338  REMOVED IN MATURE FORM (POTENTIAL).
FT DOMAIN       29    122  IG-LIKE C2-TYPE 1.
FT DOMAIN      132    214  IG-LIKE C2-TYPE 2.
FT DOMAIN      219    304  IG-LIKE C2-TYPE 3.
FT DISULFID     53    111  POTENTIAL.
FT DISULFID    153    197  POTENTIAL.
FT DISULFID    239    290  POTENTIAL.
FT CARBOHYD     40     40  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD     66     66  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    136    136  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    148    148  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    279    279  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    287    287  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    300    300  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    315    315  N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID       315    315  GPI-anchor amidated asparagine
FT                         (Potential).
SQ SEQUENCE 338 AA; 37308 MW; 03455F286DF5D92F CRC64;



Query Match 5.7%; Score 82.5; DB 1; Length 338; 

Best Local Similarity 29.6%; Pred. No. 4.6; 

Matches 37; Conservative 14; Mismatches 47; Indels 27; Gaps 7; 



Qy 101 SAGGEESLKCEDLKVG QYICKDPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSS 156 

: I : Mill I :: I :|| I : I I hi :: 

Db 230 TTGRQASLKCEASAVPAPDFEWYRDDTRINSANGLEIKSTE GQS SLTVTNVT-E EH Y 285 

Qy 157 GNETHFTGNEVG FFKPISCRNVNGYSYKVAVALSLFLGWLGADRFYLGYPALG 209 

II I I::| I : I I I : I I I : I I I II I I 

Db 286 GNYTCVAANKLGVTNASLVLFRPGSVRGING-SISLAVPL WL LAASLLC 333 

Qy 210 LLKFC 214 

I I I 

Db 334 LLSKC 338 



RESULT 11 
GATA_RHIME

ID GATA_RHIME STANDARD; PRT; 493 AA.

AC Q92QK7; 

DT 15-MAR-2004 (Rel. 43, Created) 

DT 15-MAR-2004 (Rel. 43, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Glutamyl-tRNA(Gln) amidotransferase subunit A (EC 6.3.5.-) (Glu-ADT

DE subunit A).

GN GATA OR R01312 OR SMC01352.

OS Rhizobium meliloti (Sinorhizobium meliloti).

OC Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;

OC Rhizobiaceae; Sinorhizobium/Ensifer group; Sinorhizobium.

OX NCBI_TaxID=382; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=1021; 

RX MEDLINE=21396507; PubMed=11481430;

RA Capela D., Barloy-Hubler F., Gouzy J., Bothe G., Ampe F., Batut J.,

RA Boistard P., Becker A., Boutry M., Cadieu E., Dreano S., Gloux S.,

RA Godrie T., Goffeau A., Kahn D., Kiss E., Lelaure V., Masuy D.,

RA Pohl T., Portetelle D., Puehler A., Purnelle B., Ramsperger U.,

RA Renard C., Thebault P., Vandenbol M., Weidner S., Galibert F.;

RT "Analysis of the chromosome sequence of the legume symbiont 

RT Sinorhizobium meliloti strain 1021."; 

RL Proc. Natl. Acad. Sci. U.S.A. 98:9877-9882(2001). 

CC -!- FUNCTION: Furnishes a means for formation of correctly charged 

CC Gln-tRNA(Gln) through the transamidation of misacylated Glu- 

CC tRNA(Gln) in organisms which lack glutaminyl-tRNA synthetase. The 

CC reaction takes place in the presence of glutamine and ATP through 

CC an activated gamma-phospho-Glu-tRNA(Gln) (By similarity).

CC -!- CATALYTIC ACTIVITY: ATP + L-glutamyl-tRNA(Gln) + L-glutamine = ADP 

CC + phosphate + L-glutaminyl-tRNA(Gln) + L-glutamate. 

CC -!- SUBUNIT: Heterotrimer of A, B and C subunits (By similarity). 

CC -!- SIMILARITY: Belongs to the amidase family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 



DR EMBL; AL591786; CAC45891.1; -. 

DR HAMAP; MF_00120; -; 1. 

DR InterPro; IPR000120; Amidase. 

DR InterPro; IPR004412; GatA. 

DR Pfam; PF01425; Amidase; 1. 

DR TIGRFAMs; TIGR00132; gatA; 1. 

DR PROSITE; PS00571; AMIDASES; 1.

KW Protein biosynthesis; Ligase; Complete proteome. 

SQ SEQUENCE 493 AA; 52654 MW; 1B7D595A197EF425 CRC64; 

Query Match 5.7%; Score 81.5; DB 1; Length 493; 

Best Local Similarity 21.1%; Pred. No. 8.9; 

Matches 60; Conservative 38; Mismatches 116; Indels 71; Gaps 15; 

Qy 3 I LKGS PNVI PRAHGQKNTRRDGTGLY- PMRGP FK NLALLPFSLPLLGGGGSGSGEK 57 

:: I I : III I I I : : I : : I I I : I II II 

Db 117 VMLGKLNMDEFAMGSSNE TSYYGPVKNPWRAKGSNLDLVP GGSSGGSAAA 166 

Qy 58 VS VS KMAAAWP S GP SAPEAVTARLVGVLWFVS VTTG P WGAVAT S AGGE E S L K- C ED L 113 

I : II: I I : : I I I I I I : : : : : I : 

Db 167 VAARLC AGAT AT DT GG S I RQ PAAFT GT VG - IKPTYGRCS RWGWAFAS S L DQAG P I ARD V 225 

Qy 114 KVGQYICK DPKINDATQEPVNCTNYTAHVSCFPAPNITCKDSSGNETHFTGNEVG 168 

: : I I I I I I : : I I : I : I . 

Db 226 RDAAILLKSMASIDPK--DTTSVDLPVPDYEAAIG QSIKGMRIG 267

Qy 169 FFKPISCRNVNGYSYKVAVALSLFLGWL GADRFYLGYPALGLLKFCTVGFCGIGSLI 225 

I I : I : : I I I I : : I 
Db 268 I PKEY RVDGMPEDI EALWQQGI AWLRDAGAEI VDI SLPH T 307 

Qy 226 DFILISMQIVGPSDGSSYIIDY YGTRLTRLSITNETFRKTQ 266

: I : | | | : : M : I III: I : : I I : 

Db 308 KYALPAYYIVAPAEASSNLARYDGVRYGLRVDGKDII-DMYEKTR 351 



RESULT 12 
VG32_BPMD2 

ID VG32_BPMD2 STANDARD; PRT; 223 AA. 

AC O64226;

DT 15-DEC-1998 (Rel. 37, Created) 

DT 15-DEC-1998 (Rel. 37, Last sequence update) 

DT 15-DEC-1998 (Rel. 37, Last annotation update) 

DE Gene 32 protein (GP32).

GN 32.

OS Mycobacteriophage D29.

OC Viruses; dsDNA viruses, no RNA stage; Caudovirales; Siphoviridae.

OX NCBI_TaxID=28369; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98300335; PubMed=9636706; 

RA Ford M.E., Sarkis G.J., Belanger A.E., Hendrix R.W., Hatfull G.F.; 

RT "Genome structure of mycobacteriophage D29: implications for phage 

RT evolution."; 

RL J. Mol. Biol. 279:143-164(1998). 

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; AF022214; AAC18473.1; -. 

DR PIR; F72803; F72803. 

SQ SEQUENCE 223 AA; 21822 MW; 33CD0DC310038AD4 CRC64; 

Query Match 5.6%; Score 80.5; DB 1; Length 223; 

Best Local Similarity 30.7%; Pred. No. 4.2; 

Matches 27; Conservative 8; Mismatches 36; Indels 17; Gaps 3; 

Qy 29 PMRGPFKNLALLPFSLP — LLGGGGS GS GEKVSVS KMAAAWPS GPSAPEAVTA 79 

I : I : : : : I I I I I I I I I I I I III 

Db 37 PVLTPVTAVGAYTYNIPAQAEFIDVILLGAGGGGQG MGSATAWGQGGFGGSWVTA 91 



Qy 80 RL VGVLWFVS VTTGPWGAVAT S AGG 104 

I I : I I : II II I : I 
Db 92 TLRRGVDI PWAVTQITGVI GAGGTAGPG 119 



RESULT 13 
ENT1_HUMAN 

ID ENT1_HUMAN STANDARD; PRT; 455 AA.

AC Q99808; Q9UJY2; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Equilibrative nucleoside transporter 1 (Equilibrative 

DE nitrobenzylmercaptopurine riboside-sensitive nucleoside transporter) 

DE (Equilibrative NBMPR-sensitive nucleoside transporter) (Nucleoside 

DE transporter, es-type).

GN SLC29A1 OR ENT1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 1-21.

RC TISSUE=Placenta;

RX MEDLINE=97140266; PubMed=8986748;

RA Griffiths M., Beaumont N., Yao S.Y.M., Sundaram M., Boumah C.E.,

RA Davies A., Kwong F.Y.P., Coe I., Cass C.E., Young J.D., Baldwin S.A.;

RT "Cloning of a human nucleoside transporter implicated in the cellular 

RT uptake of adenosine and chemotherapeutic drugs."; 

RL Nat. Med. 3:89-93(1997). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Jejunum, and Small intestine;

RX MEDLINE=20216090; PubMed=10755314;

RA Lum P.Y., Ngo L.Y., Bakken A.H., Unadkat J.D.; 

RT "Human intestinal es nucleoside transporter: molecular 

RT characterization and nucleoside inhibitory profiles."; 

RL Cancer Chemother. Pharmacol. 45:273-278(2000). 

RN [3] 



RP SEQUENCE FROM N.A. 

RA Graham K.A., Coe I.R., Carpenter P., Baldwin S.A., Young J.D., 

RA Cass C.E.;

RT "Genomic sequence of the human equilibrative nucleoside transporter 1

RT (hENT1).";

RL Submitted (SEP-1999) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22272396; PubMed=12384580;

RA Sankar N., Machado J., Abdulla P., Hilliker A.J., Coe I.R.;

RT "Comparative genomic analysis of equilibrative nucleoside transporters 

RT suggests conserved protein structure despite limited sequence 

RT identity."; 

RL Nucleic Acids Res. 30:4339-4350(2002). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Colon, and Muscle; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- FUNCTION: MEDIATES BOTH INFLUX AND EFFLUX OF NUCLEOSIDES ACROSS

CC THE MEMBRANE (EQUILIBRATIVE TRANSPORTER). IT IS SENSITIVE (ES) TO

CC LOW CONCENTRATIONS OF THE INHIBITOR NITROBENZYLMERCAPTOPURINE

CC RIBOSIDE (NBMPR) AND IS SODIUM-INDEPENDENT. IT HAS A HIGHER

CC AFFINITY FOR ADENOSINE. INHIBITED BY DIPYRIDAMOLE AND DILAZEP

CC (ANTICANCER CHEMOTHERAPEUTICS DRUGS).

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: EXPRESSED IN HEART, BRAIN, MAMMARY GLAND, 

CC ERYTHROCYTES AND PLACENTA, AND ALSO IN FETAL LIVER AND SPLEEN. 

CC -!- PTM: Glycosylated. 

CC -!- SIMILARITY: BELONGS TO THE SLC29A FAMILY OF TRANSPORTERS. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).



CC
DR EMBL; U81375; AAC51103.1; -.
DR EMBL; AF079117; AAC62495.1;
DR EMBL; AF190884; AAF02777.1;
DR EMBL; AF495730; AAM11785.1;
DR EMBL; BC001382; AAH01382.1;
DR EMBL; BC008954; AAH08954.1;
DR Genew; HGNC:11003; SLC29A1.
DR MIM; 602193; -.
DR GO; GO:0005887; C:integral to plasma membrane; TAS.
DR GO; GO:0005624; C:membrane fraction; TAS.
DR GO; GO:0005337; F:nucleoside transporter activity; TAS.
DR GO; GO:0006139; P:nucleobase, nucleoside, nucleotide and nucl.
DR GO; GO:0015858; P:nucleoside transport; TAS.
DR InterPro; IPR002259; DER/eqnu_transpt.
DR Pfam; PF01733; Nucleoside_tran; 1.
DR PRINTS; PR01130; DERENTRNSPRT.
DR ProDom; PD005102; DER/eqnu_transpt; 1.
DR TIGRFAMs; TIGR00939; 2a57; 1.
KW Transmembrane; Transport; Glycoprotein.
FT INIT_MET      0      0
FT DOMAIN        1     11  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM     12     28  POTENTIAL.
FT DOMAIN       29     81  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM     82    106  POTENTIAL.
FT DOMAIN      107    110  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    111    129  POTENTIAL.
FT DOMAIN      130    137  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    138    156  POTENTIAL.
FT DOMAIN      157    173  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    174    198  POTENTIAL.
FT DOMAIN      199    205  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    206    226  POTENTIAL.
FT DOMAIN      227    290  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    291    310  POTENTIAL.
FT DOMAIN      311    322  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    323    341  POTENTIAL.
FT DOMAIN      342    358  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    359    377  POTENTIAL.
FT DOMAIN      378    392  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    393    412  POTENTIAL.
FT DOMAIN      413    430  CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    431    451  POTENTIAL.
FT DOMAIN      452    455  EXTRACELLULAR (POTENTIAL).
FT CARBOHYD     47     47  N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 455 AA; 50088 MW; 9098E95E26515850 CRC64;



Query Match 5.6%; Score 80.5; DB 1; Length 455; 

Best Local Similarity 22.6%; Pred. No. 9.9; 

Matches 52; Conservative 39; Mismatches 82; Indels 57; Gaps 14; 

Qy 30 MRGPFKN LA-LLP — FSLPLLGGGGSGSGEKVSVSKMAAAWPSGPSAPEA VTARL 81 

: : I I I I I I : : I : : I I : I I I : I I II I : : I I 

Db 156 LQGSLFGLAGLLPASYTAPIMSGQGL-AGFFASVA-MICAIASGSELSESAFGYFITACA 213 

Qy 82 VGVLWFVSVTTGP WGAVATSAGGEESLKCEDLKVGQYICKDPK IND 127 

I : I : I : : I I : I : : I : : I : : : 



Db 214 VI ILTI I CYLGLPRLEFYRYYQQLKLEGPGEQETKLDLI SKGE EPRAGKEESGVSV 269



Qy 128 ATQEPVN CTNYTAHVSCFPAPNITCKDS-SGNETHFTGNEV 167 

: : I I I : I : I I I : I I : I : I 
Db 270 SNSQPTNESH S I KAI LKN I S VLAFS VCFI FT IT I GMFPAVTVEVKS S I AGS STW E 324 

Qy 168 GFFKPISC-RNVNGYSYKVAVALSLFLGWLGADRFYLGYPALGLLKFCTV 216

: I I : II I : : : : I : I I I : I I : I I : I

Db 325 RYFIPVSCFLTFNIFDWLGRSLTAVFM-WPGKDSRWL--PSLVLARLVFV 371



RESULT 14 
IBP2_HUMAN 

ID IBP2_HUMAN STANDARD; PRT; 328 AA.

AC P18065; Q14619; 

DT 01-NOV-1990 (Rel. 16, Created) 

DT 01-NOV-1990 (Rel. 16, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Insulin-like growth factor binding protein 2 precursor (IGFBP-2) 

DE (IBP-2) (IGF-binding protein 2). 

GN IGFBP2 OR BP2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Retina; 

RX MEDLINE=91293227; PubMed=1712312;

RA Agarwal N., Hsieh C.L., Sills D., Swaroop M., Desai B., Francke U.,

RA Swaroop A.;

RT "Sequence analysis, expression and chromosomal localization of a

RT gene, isolated from a subtracted human retina cDNA library, that

RT encodes an insulin-like growth factor binding protein (IGFBP2).";

RL Exp. Eye Res. 52:549-561(1991). 

RN [2] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 40-77. 

RX MEDLINE=90368661; PubMed=1697583;

RA Zapf J., Kiefer M., Merryweather J., Masiarz F., Bauer D., Born W.,

RA Fischer J.A., Foresch E.R.;

RT "Isolation from adult human serum of four insulin-like growth factor

RT (IGF) binding proteins and molecular cloning of one of them that is

RT increased by IGF I administration and in extrapancreatic tumor

RT hypoglycemia.";

RL J. Biol. Chem. 265:14892-14898(1990).

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Fetal liver; 

RX MEDLINE=90060007; PubMed=2479552;

RA Binkert C., Landwehr J., Mary J.L., Schwander J., Heinrich G.;

RT "Cloning, sequence analysis and expression of a cDNA encoding a novel 

RT insulin-like growth factor binding protein (IGFBP-2)."; 

RL EMBO J. 8:2497-2502(1989). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RX MEDLINE=91248211; PubMed=1710112;

RA Ehrenborg E., Vilhelmsdotter S., Bajalica S., Larsson C., Sterm I.,

RA Koch J., Brondum-Nielsen K., Luthman H.;

RT "Structure and localization of the human insulin-like growth factor-

RT binding protein 2 gene.";

RL Biochem. Biophys. Res. Commun. 176:1250-1255(1991).

RN [5] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RX MEDLINE=92293159; PubMed=1376411;

RA Binkert C., Margot J.B., Landwehr J., Heinrich G., Schwander J.;

RT "Structure of the human insulin-like growth factor binding protein-2

RT gene.";

RL Mol. Endocrinol. 6:826-836(1992).

RN [6] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain, and Uterus;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).

CC -!- FUNCTION: IGF-binding proteins prolong the half-life of the IGFs 

CC and have been shown to either inhibit or stimulate the growth 

CC promoting effects of the IGFs on cell culture. They alter the 

CC interaction of IGFs with their cell surface receptors. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- MISCELLANEOUS: Binds IGF-II more than IGF-I.

CC -!- SIMILARITY: Contains 1 IGFBP domain. 

CC -!- SIMILARITY: Contains 1 thyroglobulin type-I domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; S37730; AAB22308.1; -. 

DR EMBL; S37712; AAB22308.1; JOINED. 

DR EMBL; S37722; AAB22308.1; JOINED. 



DR EMBL; S37726; AAB22308.1; JOINED.
DR EMBL; M35410; AAA03246.1; -.
DR EMBL; X16302; CAA34373.1; -.
DR EMBL; M69241; AAA36048.1; -.
DR EMBL; M69237; AAA36048.1; JOINED.
DR EMBL; M69239; AAA36048.1; JOINED.
DR EMBL; M69240; AAA36048.1; JOINED.
DR EMBL; A09809; CAA00862.1; -.
DR EMBL; BC004312; AAH04312.1
DR EMBL; BC009902; AAH09902.1
DR EMBL; BC012769; AAH12769.1
DR PIR; A41927; A41927.
DR HSSP; P24593; 1BOE.
DR Genew; HGNC:5471; IGFBP2.
DR MIM; 146731;
DR InterPro; IPR009030; Grow_fac_recep.
DR InterPro; IPR000867; Insl_gro_fac_pr.
DR InterPro; IPR000716; Thyroglobulin_1.
DR Pfam; PF00219; IGFBP; 1.
DR Pfam; PF00086; thyroglobulin_1; 1.
DR PIRSF; PIRSF001969; IGFBP1-6; 1.
DR SMART; SM00121; IB; 1.
DR SMART; SM00211; TY; 1.
DR PROSITE; PS00222; IGF_BINDING; 1.
DR PROSITE; PS00484; THYROGLOBULIN_1; 1.
KW Growth factor binding; Signal.
FT SIGNAL        1     39
FT CHAIN        40    328  INSULIN-LIKE GROWTH FACTOR BINDING
FT                         PROTEIN 2.
FT DOMAIN      260    309  THYROGLOBULIN TYPE-I.
FT SITE        304    306  CELL ATTACHMENT SITE.
FT CONFLICT     60     60  P -> R (IN REF. 4).
FT CONFLICT    320    320  R -> C (IN REF. 3).
FT CONFLICT    323    323  H -> D (IN REF. 4).
SQ SEQUENCE 328 AA; 35137 MW; 4E6BDF6D805C8853 CRC64;

Query Match 5.6%; Score 80; DB 1; Length 328;

Best Local Similarity 27.8%; Pred. No. 7.4;

Matches 40; Conservative 10; Mismatches 48; Indels 46; Gaps 8;

Qy 22 RDGTGLYPMRGPFKNLALLPFSLPLLGGGGSGSGEKVSV SKMAAAWPSGPS 72

II 1:1 I I I I I I I I I I I : I : : | | | |

Db 4 RVGCPALPLPPP-PLLPLLPLLLLLLGASGGGGGARAEVLFRCPPCTPERLAACGPP-PV 61

Qy 73 APEAVTARLVGVLWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKIN DA 128

III I I I I I I : I : I :: I I

Db 62 APPA AVAAVAGGAR-MPCAEL VREPGCGCCSVCA 94

Qy 129 TQEPVNCTNYTAH VSCFPAP 148

I III : I : I I

Db 95 RLEGEACGVYTPRCGQGLRCYPHP 118



RESULT 15 
LIPP_PIG 

ID LIPP_PIG STANDARD; PRT; 450 AA. 

AC P00591; 



DT 21-JUL-1986 (Rel. 01, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Triacylglycerol lipase, pancreatic (EC 3.1.1.3) (Pancreatic lipase) 

DE (PL). 

GN PNLIP. 

OS Sus scrofa (Pig).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Suina; Suidae; Sus. 

OX NCBI_TaxID=9823; 

RN [1] 

RP SEQUENCE OF 308-449. 

RX MEDLINE=82113655; PubMed=7326260;

RA de Caro J.D., Boudouard M., Bonicel J.J., Guidoni A.A., Desnuelle P.,

RA Rovery M.;

RT "Porcine pancreatic lipase. Completion of the primary structure.";

RL Biochim. Biophys. Acta 671:129-138(1981).

RN [2] 

RP SEQUENCE OF 1-234, AND CARBOHYDRATE-LINKAGE SITE.

RX MEDLINE=79236335; PubMed=380992;

RA Bianchetta J.D., Bidaud J., Guidoni A.A., Bonicel J.J., Rovery M.;

RT "Porcine pancreatic lipase. Sequence of the first 234 amino acids of 

RT the peptide chain."; 

RL Eur. J. Biochem. 97:395-405(1979). 

RN [3] 

RP SEQUENCE OF 235-307. 

RX MEDLINE=80088446; PubMed=518929;

RA Guidoni A.A., Bonicel J.J., Bianchetta J.D., Rovery M.;

RT "Porcine pancreatic lipase. Sequence between the 235th and 307th 

RT amino acids."; 

RL Biochimie 61:841-845(1979). 

RN [4] 

RP DISULFIDE BONDS. 

RX MEDLINE=83105095; PubMed=7151781;

RA Benkouka F., Guidoni A.A., de Caro J.D., Bonicel J.J.,

RA Desnuelle P.A., Rovery M.;

RT "Porcine pancreatic lipase. The disulfide bridges and the sulfhydryl 

RT groups."; 

RL Eur. J. Biochem. 128:331-341(1982). 

RN [5] 

RP SUBSTRATE-BINDING SITE. 

RX MEDLINE=82000578; PubMed=6791692;

RA Guidoni A.A., Benkouka F., de Caro J.D., Rovery M.;

RT "Characterization of the serine reacting with diethyl p-nitrophenyl 

RT phosphate in porcine pancreatic lipase."; 

RL Biochim. Biophys. Acta 660:148-150(1981). 

RN [6] 

RP STRUCTURE OF CARBOHYDRATE. 

RX MEDLINE=88082841; PubMed=3691527;

RA Fournet B., Leroy Y., Montreuil J., Decaro J., Rovery M.,

RA van Kuik J.A., Vliegenthart J.F.G.;

RT "Primary structure of the glycans of porcine pancreatic lipase."; 

RL Eur. J. Biochem. 170:369-371(1987). 

RN [7] 

RP X-RAY CRYSTALLOGRAPHY (2.8 ANGSTROMS), AND REVISIONS TO 30-32. 

RX MEDLINE=96279347; PubMed=8663362;

RA Hermoso J., Pignol D., Kerfelec B., Crenon I., Chapus C.,



RA Fontecilla-Camps J.C.; 

RT "Lipase activation by nonionic detergents. The crystal structure of 

RT the porcine lipase-colipase-tetraethylene glycol monooctyl ether 

RT complex."; 

RL J. Biol. Chem. 271:18007-18016(1996). 

CC -!- CATALYTIC ACTIVITY: Triacylglycerol + H(2)O = diacylglycerol + a

CC fatty acid anion. 

CC -!- SUBCELLULAR LOCATION: Secreted. 

CC -!- SIMILARITY: Belongs to the AB hydrolase superfamily. Lipase
CC family. 

CC -!- SIMILARITY: Contains 1 PLAT domain. 

CC -!- DATABASE: NAME=Worthington enzyme manual; 

CC WWW="http://www.worthington-biochem.com/PL/".

DR PDB; 1ETH; 07-DEC-96. 

DR InterPro; IPR000734; Lipase. 

DR InterPro; IPR008262; Lipase_AS. 

DR InterPro; IPR001024; Lipoxygenase_LH2.

DR InterPro; IPR008976; PLAT_LH2.

DR InterPro; IPR000379; Ser_estrs. 

DR Pfam; PF00151; lipase; 1. 

DR Pfam; PF01477; PLAT; 1. 

DR PRINTS; PR00821; TAGLIPASE. 

DR SMART; SM00308; LH2; 1.

DR PROSITE; PS00120; LIPASE_SER; 1. 

DR PROSITE; PS50095; PLAT; 1. 

KW Hydrolase; Lipid degradation; Pancreas; Glycoprotein; 3D-structure.



FT DOMAIN      339    450  PLAT.
FT ACT_SITE    153    153  CHARGE RELAY SYSTEM.
FT ACT_SITE    177    177  CHARGE RELAY SYSTEM.
FT ACT_SITE    264    264  CHARGE RELAY SYSTEM.
FT DISULFID      4     10
FT DISULFID     91    102  IN ISOMER 1.
FT DISULFID     91    104  IN ISOMER 2.
FT DISULFID    238    262
FT DISULFID    286    297
FT DISULFID    300    305
FT DISULFID    434    450
FT CARBOHYD    167    167  N-LINKED (GLCNAC...).
SQ SEQUENCE 450 AA; 50084 MW; 76E13BBDB4541E0E CRC64;

Query Match 5.6%; Score 80; DB 1; Length 450;

Best Local Similarity 20.2%; Pred. No. 11;

Matches 47; Conservative 30; Mismatches 96; Indels 60; Gaps 10;

Qy 1 MHILKGSPNVIP RAHGQKNTRRDG TGLYPMRGPFKNLALL 40 

: : I I I I II: I : I I I I I I : I 

Db 133 VEVLKSSLGYSPSNVHVI GHSLGSHAAGEAGRRTNGTIERITGLDPAEPCFQGTPELVRL 192 

Qy 41 PFSLPLLGGGGSGSGEKVSVSKMAAAWPSG P SAP EAVTARLVGV 84 

: I : : MM MM I : : : : :| : 

Db 193 DPSDAKFVDVIHTDAAPIIPNLGFGMSQTVGHLDF FPNGGKQMPGCQKNILSQIVDI 249 

Qy 85 LWFVSVTTGPWGAVATSAGGEESLKCEDLKVGQYICKDPKINDATQEPVNCTNYTAHVS- 143

II | : : | | : :| | :| I M : 

Db 250 DGIW EGTRDFVACNHLRSYKYYA-DSILNPDGFAGFPCDSYNVFTAN 295 



Qy 144 -CFPAPNITCKDSSGNETHFTG- NEVGFFKPISCRNVNGYSYKVAVALS 190



Ill I: I II ::| : I : MM II 

Db 296 KCFPCPSEGCPQMGHYADRFPGKTNGVSQVFYLNTGDASNFARWRYKVSVTLS 348 



Search completed: March 4, 2004, 10:25:14 
Job time : 50 secs



